Preface


This is the second part of the earlier publication PRE-UNIVERSITY
STATISTICS: PART 1, and a distinct improvement is attempted to make
it more exhaustive by elaborating the concepts and principles and by
including a larger number of worked as well as self-assessment
exercises.

Due to various reasons there has been a wide gap between the
manuscript and the final print. Around this time the Pre-University
Class syllabus was also revised, and understandably the entire format
was recast and reorganised. Despite these constraints and the
extensive changes in format as compared to Part I, effort has been
taken to see that the contents are exhaustive, comprehensive and
rigorous, and the approach in general has been to present the theory
with some connection to practical problems. My greatest personal debt
is to Shri Anand R. Kundaji of M/s Wiley Eastern Ltd., who kindled my
interest in writing. I am indebted to Shri Murali Varadan and Shri
M.S. Sejwal of M/s Wiley Eastern Ltd., for their constant
encouragement and suggestions. I am indebted to Shri M.R. Ramesh, my
student and now colleague, for his help in reviewing the manuscript.
I appreciate and express my thanks to most of my colleagues and in
particular Dr. D. Gopinath for his help in preparing the index, and
Dr. Sridhar and Dr. [Ms] Savithri Ramaiah who assisted in the visual
aids for the book. I would also like to express my gratitude to Shri
T.S. Ranga of M/s Kruti for his help in typing and proof reading and
Shri T.S. Ravichandar of M/s Intertec Data Systems for preparing the
exponential tables, and to both for their constant encouragement.

Finally, I wish to express my sincere thanks to Shri V.R. Shyam,
leading Kannada novelist and journalist, for his guidance and help.

I am grateful to my wife Smt. Vidya and my children Chi. Avinash
and Kum. Archana for their random checks and criticism and also
for their patience despite my long hours on this book. My thanks
are due to Chi. Aparna, Anuradha and Shravan for their help.

I am to be blamed for any deficiencies in this book, and suggestions
in this regard are most welcome.


August 5, 1988 A.N. VISWANATH 
1. Probability


1.1. BASIC MATHEMATICS 


Elementary Set Theory 

It is usual and convenient to think of objects in groups, such as a
group of teachers, a collection of cassettes, a crowd hearing a
public speech and so forth. Each such group is termed a set of
objects. Hence a set is a collection of distinguishable objects of
our thought. The members of a set are known as its elements, and it
should be possible to say whether a given element belongs to a set
or not.

Normally a set is defined by listing its members within braces
(flower brackets). By convention, capital letters of the alphabet
are used to denote a set, whereas small letters are used to denote
its elements.

The set of even numbers from 2 to 12 ($E$) can be denoted by

    $E = \{2, 4, 6, 8, 10, 12\}$

and the set of months having only 30 days in the calendar by

    C = {April, June, Sept., Nov.}

This method of listing the elements of a set is called the 'Roster
method'. Alternatively, stating the rule or condition under which a
given object belongs to a set is called the 'Rule method'. It is
possible to convert the roster method to the rule method or vice
versa.

Thus E = {Set of even numbers from 2 to 12}
     C = {Set of months having only 30 days in the calendar}

The symbol $\in$ is used to indicate that an element belongs to a
set, and $\notin$ indicates that an element does not belong to a set.
Consider the set $E = \{2, 4, 6, 8, 10, 12\}$.
    $8 \in E$ means that '8' belongs to $E$.
    $1 \notin E$ means that '1' does not belong to $E$.

A finite set is a set in which all the elements can be counted,
and an infinite set is one which possesses infinitely many elements.
A set which contains no elements is called a null or empty set and
is denoted by $\phi$ or $\{\ \}$.

If every element of set $A$ belongs to the set $B$, then $A$ is
called a subset of $B$, denoted by $A \subset B$.


Ex. 1: Form as many subsets as possible by using the elements
of the set $S = \{1, 2, 6\}$.

    $A = \{1, 2, 6\}$                                      all the elements are used
    $A_1 = \{1, 2\}$, $A_2 = \{2, 6\}$, $A_3 = \{1, 6\}$   two elements are used
    $A_4 = \{1\}$, $A_5 = \{2\}$, $A_6 = \{6\}$            one element is used
    $A_7 = \{\ \}$                                         no element is used

All these are subsets of $S$. Thus $A \subset S$, $A_1 \subset S$,
$A_2 \subset S$, ..., $A_7 \subset S$.
The sets $A$ and $S$ are the same, which means that every set is a
subset of itself. A null or empty set is a subset of all sets.
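
The listing in Ex. 1 can be reproduced mechanically. The following is a minimal Python sketch, not part of the original text; the helper name `all_subsets` is hypothetical. It builds the subsets size by size with the standard-library `itertools.combinations`.

```python
from itertools import combinations


def all_subsets(elements):
    """Return every subset of `elements`, from the empty set up to the full set."""
    items = sorted(elements)
    subsets = []
    for size in range(len(items) + 1):
        # combinations() yields each selection of `size` items exactly once
        for combo in combinations(items, size):
            subsets.append(set(combo))
    return subsets


S = {1, 2, 6}
subsets_of_S = all_subsets(S)
for subset in subsets_of_S:
    print(sorted(subset))
print(len(subsets_of_S))  # 8 subsets for a set of 3 elements
```

The count doubles with every extra element, since each element is either in or out of a subset.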


Equal and equivalent sets
Two sets are said to be equal when both sets contain exactly the
same elements.
Consider the two sets
    $E_1 = \{1, 3, 5\}$ and $E_2 = \{3, 1, 5\}$
Here all the elements of $E_1$ are also elements of $E_2$, i.e.
$E_1 \subset E_2$. Further, all the elements of $E_2$ are also
elements of $E_1$, i.e. $E_2 \subset E_1$, so that $E_1 = E_2$.
If the number of elements in two or more sets is the same, they are
called equivalent sets.
Sometimes two or more different sets can be formed using the
elements of a set.
    Let U = {Set of all persons who voted in the last election}
        P = {Set of passive voters}
        A = {Set of active voters}

U is called the universal set for P and A: elements of the main set
are used to form the other sets.




If no common elements exist in two sets $E_1$ and $E_2$, they are
called disjoint sets. The complement of a set $E$, denoted by
$\bar{E}$ or $E'$, is defined as the set of all elements belonging to
the universal set but not to the set $E$.


Venn diagram 

The diagrammatic representation of the relationship that exists
between sets is known as a Venn diagram, after John Venn (1834-1923),
an English logician. The universal set is represented by a rectangle,
the sets are denoted by small circles or ovals, and the elements of a
set are indicated on the diagram.

Let $U = \{1, 3, 5, 7, \ldots, 27\}$ be the universal set and
$E = \{1, 5, 11, 21\}$.

Hence $E \subset U$.


Disjoint sets 


Let $U = \{a, b, c, d, \ldots, o\}$
    $P = \{a, c, e\}$
    $Q = \{b, d, f\}$


Here $P \subset U$ and $Q \subset U$. But $P$ and $Q$ are disjoint sets.

Complement of a set $A$
Let $U = \{3, 6, 9, 12, 15, 18, 21, 24, 27, 30\}$
and $A = \{3, 12, 21\}$


The portion of $U$ excluding $A$ is the complement of $A$ w.r.t. $U$,
which is denoted by $A'$.


    $A'$ w.r.t. $U = \{6, 9, 15, 18, 24, 27, 30\}$




Union of two or more sets

If  $U = \{1, 2, 3, 4, 5, 6, 7, 8\}$
    $A = \{1, 2, 3, 4, 5, 6, 7\}$
    $B = \{2, 4, 6, 8\}$


Note that the elements of $U$ are either in $A$ or in $B$ or in both;
further $A \subset U$ and $B \subset U$, and $U$ contains all the
elements of $A$ and $B$. Hence the union of two sets is the set which
contains the elements which belong to either of the two sets or to
both, i.e. $A \cup B$.


Difference of two sets $A$ and $B$
Let $A = \{1, 2, 3\}$
    $B = \{2, 3, 4\}$
The set $(A - B)$ is the set of elements of $A$ that are not in $B$,
so that $(A - B) = \{1\}$ and $(B - A) = \{4\}$.


Fig. 5.


Intersection of two sets
If  $U = \{a, b, c, d, e, f, g, h\}$
    $M = \{a, b, c, d, e\}$
    $N = \{e, d, g, h\}$

The set which contains the elements that are common to both sets
$A$ and $B$ is called the intersection of sets $A$ and $B$, denoted
by $A \cap B$.
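
The set operations described in this section map directly onto Python's built-in `set` type. The sketch below is an illustration only; the variable names are chosen here and are not from the text. It reuses the numbers from the complement and difference examples above.

```python
# Complement of A with respect to the universal set U
U = {3, 6, 9, 12, 15, 18, 21, 24, 27, 30}
A = {3, 12, 21}
complement_A = U - A          # elements of U that are not in A

# Union, intersection and difference of two sets
P = {1, 2, 3}
Q = {2, 3, 4}
union_PQ = P | Q              # in P or in Q or in both
intersection_PQ = P & Q       # common to both P and Q
p_minus_q = P - Q             # in P but not in Q
q_minus_p = Q - P             # in Q but not in P

# Disjoint sets share no element
are_disjoint = {"a", "c", "e"}.isdisjoint({"b", "d", "f"})

print(sorted(complement_A))
print(sorted(union_PQ), sorted(intersection_PQ))
print(sorted(p_minus_q), sorted(q_minus_p), are_disjoint)
```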


1.2. PERMUTATIONS AND COMBINATIONS

The arrangement in a definite order of a finite number of
distinguishable objects is known as a permutation. Thus the letters A
and B may be considered as objects, and these can be arranged in two
possible ways, namely AB and BA. When 3 letters are considered, the
possible arrangements are ABC, ACB, BAC, BCA, CAB and CBA, that is
6 possible ways. Similarly it can be seen that 4 letters can be
arranged in 24 ways, five letters in 120 ways and so on.


Thus the number of possible arrangements is

    Letters used    Possible arrangements
         2                   2
         3                   6
         4                  24
         5                 120
         6                 720
and so on.


This can be generalised. If there are $n$ letters, the first place in
the ordered arrangement can be filled in $n$ ways, the next place in
$(n - 1)$ ways since only $(n - 1)$ letters are available, the third
in $(n - 2)$ ways, the fourth in $(n - 3)$ ways and so on, and finally
the last place in only one way, since only one letter remains. Hence
the total number of ways is
$n(n - 1)(n - 2)(n - 3) \cdots 4 \cdot 3 \cdot 2 \cdot 1$, and this is
denoted by $n!$ (n factorial).


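    For 2 letters the arrangement is $2 \cdot 1$ or $2! = 2$
        3 letters    $3 \cdot 2 \cdot 1$ or $3! = 6$
        4 letters    $4 \cdot 3 \cdot 2 \cdot 1$ or $4! = 24$
        5 letters    $5 \cdot 4 \cdot 3 \cdot 2 \cdot 1$ or $5! = 120$
        6 letters    $6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1$ or $6! = 720$

and so on.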
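
The counts 2, 6, 24, 120 and 720 can be checked by actually listing the arrangements. This is a small Python sketch added for illustration; it assumes only the standard-library functions `itertools.permutations` and `math.factorial`.

```python
from itertools import permutations
from math import factorial

# List every ordered arrangement of the letters A, B, C
arrangements_of_abc = ["".join(p) for p in permutations("ABC")]
print(arrangements_of_abc)

# Compare the number of listed arrangements with n! for n = 2 .. 6
counts = {}
for n in range(2, 7):
    letters = "ABCDEF"[:n]
    counts[n] = sum(1 for _ in permutations(letters))
    print(n, counts[n], factorial(n))
```

Listing is practical only for small n; for larger n the factorial formula is used directly.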


Thus the number of permutations of $n$ objects taken $r$ at a time,
denoted by ${}_nP_r$, is
    $${}_nP_r = n(n-1)(n-2)(n-3) \cdots (n-r+1) = \frac{n!}{(n-r)!}$$
If $n$ objects are taken $n$ at a time, then
    $${}_nP_n = n(n-1)(n-2) \cdots 3 \cdot 2 \cdot 1 = n!$$
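Illustration: The number of permutations of the letters $\alpha$,
$\beta$, $\gamma$, $\delta$ taken two at a time is
    ${}_4P_2 = 4 \times 3 = 12$ ways
and they are
    $\alpha\beta$, $\alpha\gamma$, $\alpha\delta$
    $\beta\gamma$, $\beta\delta$, $\beta\alpha$
    $\gamma\delta$, $\gamma\alpha$, $\gamma\beta$
    $\delta\alpha$, $\delta\beta$, $\delta\gamma$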


The number of permutations of $n$ objects comprising groups in
which $n_1$ are alike and $n_2$ are alike is
    $$\frac{n!}{n_1!\, n_2!}, \quad \text{where } n = n_1 + n_2.$$

The number of permutations of the letters of a word, say
'SIGNIFICANCE', is
    $$\frac{12!}{1!\,3!\,1!\,2!\,1!\,2!\,1!\,1!} = 19958400$$
since the letters are repeated as

    S - 1 time
    I - 3 times
    G - 1 time
    N - 2 times
    F - 1 time
    C - 2 times
    A - 1 time
    E - 1 time
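
The figure 19958400 can be verified by counting the repeated letters automatically. The following Python sketch is an illustration; the function name `distinct_arrangements` is hypothetical. It divides 12! by the factorial of each letter's repeat count.

```python
from collections import Counter
from math import factorial


def distinct_arrangements(word):
    """Number of distinct orderings of the letters of `word`."""
    total = factorial(len(word))
    # Divide out the orderings of each group of identical letters
    for repeat in Counter(word).values():
        total //= factorial(repeat)
    return total


letter_counts = Counter("SIGNIFICANCE")
print(dict(letter_counts))
print(distinct_arrangements("SIGNIFICANCE"))  # 19958400
```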


Combinations: If it is desired to find the number of ways of
selecting a number of objects without taking cognisance of the order
of arrangement, the combination comes into the picture. Thus a
combination of $n$ objects taken $r$ at a time is nothing but a
selection of $r$ objects out of $n$ objects. The number of
combinations that can be made out of $n$ objects taken $r$ at a time,
denoted by ${}_nC_r$, is
    $${}_nC_r = \frac{n!}{r!\,(n-r)!} = \frac{n(n-1)(n-2) \cdots (n-r+1)}{r!}$$


It can be seen that permuting each combination of $r$ things amongst
themselves leads to all possible permutations of $n$ things $r$ at a
time. Each combination brings in $r!$ permutations, and hence
    $$r! \cdot {}_nC_r = {}_nP_r = \frac{n!}{(n-r)!}$$
    $$\therefore\ {}_nC_r = \frac{n!}{r!\,(n-r)!} \quad \text{or} \quad {}_nC_r = \frac{{}_nP_r}{r!}$$
The number of ways of selecting 3 out of 8 people is
    $${}_8C_3 = \frac{8!}{3!\,5!} = 56.$$
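
The relation between permutations and combinations can be checked numerically. This is a brief Python sketch, assuming Python 3.8 or later, where `math.perm` and `math.comb` are available in the standard library.

```python
from math import comb, factorial, perm

n, r = 8, 3

permutations_8_3 = perm(n, r)   # n! / (n - r)!
combinations_8_3 = comb(n, r)   # n! / (r! (n - r)!)

print(permutations_8_3)  # 336
print(combinations_8_3)  # 56

# Each combination of r things can be ordered in r! ways
print(factorial(r) * combinations_8_3 == permutations_8_3)  # True
```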
1.3. BINOMIAL THEOREM
For any positive integer $n$
    $$(p + q)^n = p^n + {}_nC_1\, p^{n-1} q + {}_nC_2\, p^{n-2} q^2 + {}_nC_3\, p^{n-3} q^3 + \cdots + q^n \qquad (1)$$
The coefficients ${}_nC_1$, ${}_nC_2$, ${}_nC_3$, ... are known as
binomial coefficients and the right hand side has $(n + 1)$ terms.
This is known as the Binomial theorem.


Proof: For $n = 2$
    $$(p + q)^2 = p^2 + {}_2C_1\, pq + {}_2C_2\, q^2 = p^2 + 2pq + q^2$$
hence the theorem is true for $n = 2$.
For $n = 3$
    $$(p + q)^3 = p^3 + {}_3C_1\, p^2 q + {}_3C_2\, p q^2 + {}_3C_3\, q^3 = p^3 + 3p^2 q + 3pq^2 + q^3$$
hence the theorem is true for $n = 3$, and so on.




Multiply both sides of equation (1) by $(p + q)$:
$$\begin{aligned}
(p + q)^{n+1} &= [p^n + {}_nC_1\, p^{n-1} q + {}_nC_2\, p^{n-2} q^2 + {}_nC_3\, p^{n-3} q^3 + \cdots + q^n](p + q) \\
&= p^{n+1} + {}_nC_1\, p^n q + {}_nC_2\, p^{n-1} q^2 + {}_nC_3\, p^{n-2} q^3 + \cdots + p q^n \\
&\quad + p^n q + {}_nC_1\, p^{n-1} q^2 + {}_nC_2\, p^{n-2} q^3 + \cdots + q^{n+1} \\
&= p^{n+1} + {}_{n+1}C_1\, p^n q + {}_{n+1}C_2\, p^{n-1} q^2 + {}_{n+1}C_3\, p^{n-2} q^3 + \cdots + q^{n+1}
\end{aligned}$$
since ${}_nC_r + {}_nC_{r-1} = {}_{n+1}C_r$.

Hence, if the theorem is true for $n$, it is also true for $(n + 1)$,
and so for $(n + 2)$ and so on. But the theorem is true for
$n = 2, 3$. Hence, by induction, the theorem is true for all positive
integers $n$.

The expansion of $(p + q)^n$ can also be written in the following
manner:
    $$(p + q)^n = \sum_{r=0}^{n} {}_nC_r\, p^{n-r} q^r$$
The binomial coefficients may also be obtained by the technique of
Pascal's pyramid or triangle (due to Blaise Pascal, 1623-62).


n Coefficients 

0 1 

1 1 1 

2 1 2 1 

3 1 3 3 1 

4 1 4 6 4 1 

5 1 5 10 10 5 1 

6 1 6 15 20 15 6 1 

7 1 7 21 35 35 21 7 1 

8 1 8 28 56 70 56 28 8 1 


The coefficients are obtained by successive additions.
Thus $$(p + q)^4 = p^4 + 4p^3 q + 6p^2 q^2 + 4pq^3 + q^4$$
and  $$(p + q)^8 = p^8 + 8p^7 q + 28p^6 q^2 + 56p^5 q^3 + 70p^4 q^4 + 56p^3 q^5 + 28p^2 q^6 + 8pq^7 + q^8$$
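
The "successive additions" that generate Pascal's triangle can be sketched in a few lines of Python. This is an illustration only; the function name `pascal_rows` is hypothetical. Each inner entry of a row is the sum of the two entries above it, and the result is compared with `math.comb`.

```python
from math import comb


def pascal_rows(max_n):
    """Return rows 0 .. max_n of Pascal's triangle built by successive addition."""
    rows = [[1]]
    for _ in range(max_n):
        prev = rows[-1]
        # Inner entries are sums of adjacent entries of the previous row
        inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows


rows = pascal_rows(8)
for n, row in enumerate(rows):
    print(n, row)

# Every entry agrees with the binomial coefficient nCr
matches = all(rows[n][r] == comb(n, r) for n in range(9) for r in range(n + 1))
print(matches)  # True
```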


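1.4. EXPONENTIAL SERIES

The series
    $$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots$$
is known as an exponential series.

Thus,
    $$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots = 2.71828\ldots$$
which is the base of natural logarithms.
By putting $x = -x$
    $$e^{-x} = 1 - \frac{x}{1!} + \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots$$

Further it can also be shown that
    $$\lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = e$$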

1.5. PROBABILITY THEORY 

In practice it is quite common to come across statements like a
'good probability' of winning a match, or a very low chance of an
event occurring, such as India having a very low chance of winning
Wimbledon 1988 tennis, and so on. Usually the probability values
quoted, say 20%, 50% etc., are arrived at by guess work, by
estimation or by mathematical methods. Though probability statements
arise in most practical situations, the methods are developed under
the ideal conditions of games of chance. The following are some of
the terms used in practice for defining probability, as also the
rules governing them.


Sample Space 

The set of points representing the possible outcomes of an
experiment is called the Sample Space for the experiment. The
collection of all possible quantities of some type, or a complete set
of phenomena, is known as a 'Space', and when we deal with problems
involving the selection or picking of Samples from this Space it is
known as a Sample Space.


The Sample Space (S) for throwing two fair dice is

    (1, 1) (1, 2) (1, 3) (1, 4) (1, 5) (1, 6)
    (2, 1) (2, 2) (2, 3) (2, 4) (2, 5) (2, 6)
    (3, 1) (3, 2) (3, 3) (3, 4) (3, 5) (3, 6)
    (4, 1) (4, 2) (4, 3) (4, 4) (4, 5) (4, 6)
    (5, 1) (5, 2) (5, 3) (5, 4) (5, 5) (5, 6)
    (6, 1) (6, 2) (6, 3) (6, 4) (6, 5) (6, 6)

Hence the Sample Space for throwing two fair dice consists of 36 
points. 
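
The 36 points of this Sample Space can be generated rather than written out by hand. A minimal Python sketch follows, using the standard-library `itertools.product`; the variable names are chosen here for illustration.

```python
from itertools import product

faces = range(1, 7)

# Every ordered pair (first die, second die)
sample_space = list(product(faces, repeat=2))

# Print the space as six rows, one row per face of the first die
for first in faces:
    print(" ".join(str(point) for point in sample_space if point[0] == first))

print(len(sample_space))  # 36
```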

From this it can be observed that the Sample Space is the description
of all possible outcomes of the experiment. The term Sample is used
to denote the outcome of the experiment which has been observed.


Events: Let us consider how exactly an event is distinguished from
the Sample Space. In an experiment of throwing a die, the possible
outcomes are 1, 2, 3, ..., 6, and the tossing of a coin results in
either a Head or a Tail. The outcomes of these experiments result in
what are known as events. Thus events are nothing but sets of
descriptions. Hence an event is defined as any subset of the Sample
Space S.


Complementary event: For any event $E$, it is usually desired to
know about the non-occurrence of $E$. Thus for any event $E$ there
exists an event $E'$ or $\bar{E}$, called the complement of $E$,
which is the event that $E$ does not occur and consists of all
descriptions in the Sample Space S which are not in $E$.

Suppose $E$ is the event that the throw of a die results in an odd
number, namely 1, 3 or 5. The complementary event $E'$ is nothing
but the throw resulting in an even number, namely 2, 4 or 6.


Mutually Exclusive Events 

If two events $E_1$ and $E_2$ possess the property that the
occurrence of one prevents the occurrence of the other, they are
called mutually exclusive events. If $E_1$ and $E_2$ are the events
of obtaining a total of 6 and 10 in tossing a pair of dice, these
events are mutually exclusive since there are no common outcomes
which correspond to the occurrence of both $E_1$ and $E_2$, that is,
the two events possess no points in common in the Sample Space. This
can be described by set theory also. An impossible event is the event
that contains no descriptions and hence cannot occur, which is
nothing but the empty or null set. Any events $E_1$ and $E_2$ that
cannot occur simultaneously, so that their intersection
$E_1 \cap E_2$ is an impossible event, are said to be mutually
exclusive or disjoint events.


Independent Events 

If the events $E_1$ and $E_2$ are such that the occurrence of, say,
$E_1$ does not depend on whether $E_2$ occurs or not, then $E_1$ is
said to be independent of $E_2$. While tossing a coin several times,
the event of getting a tail in the first toss is independent of
getting a tail in the second and successive tosses.


Probability 

The word probability is used in common practice in statements such
as the chance of a person aged 60 years surviving for 5 more years,
the probability of a favourable monsoon for 1988, the probability of
an individual's victory in elections, the chance of a cricket team
winning a match, the chance of winning a lottery etc. These lead to
uncertain and doubtful propositions like 'a fifty-fifty chance', or
'I bet 1 to 250 that he will lose the elections', or that the chance
of winning a match is nearly certain, and so on. By applying some
elementary mathematics, probability is studied to throw some light
on these uncertainties and to present them in numerical form.


1.6. DEFINITION
If an event $E$ can happen in $m$ cases out of $n$ possible and
equally likely cases, then the probability of the event $E$ is
defined as the ratio of $m$ to $n$, i.e.
    $$p(E) = \frac{m}{n}$$
Hence to determine the probability of an event $E$ it is necessary to
know the number of possible cases as well as the number of cases in
which the event will have occurred, known as favourable cases.

The probability can be obtained in this form in all situations
wherein the possible and equally likely cases can be counted, as in
the experiment of tossing a coin or a die. In the former there are
two possible outcomes, namely the head and the tail, and these are
equally likely too, so that the probability of obtaining a head is
$\frac{1}{2}$; in the latter case there are 6 possible and equally
likely cases, so that the probability of obtaining, say, '2' is
$\frac{1}{6}$. In these and similar examples of simple games of
chance the total possible and equally likely cases can easily be
enumerated. In certain other situations it may not be possible to
enumerate cases which can be considered equally likely or equally
probable, since the possible events may not be equally likely, as in
the case of smokers and non-smokers afflicted with lung cancer. The
affliction of cancer amongst smokers and amongst non-smokers is not
equally likely, and it is desired to know the relative likelihood in
these two cases. Hence another form is advocated, wherein the term
empirical or estimated probability is used and the probability of an
event is taken as the relative frequency of the event when the number
of observations is very large.
Thus $$p(E) = \lim_{n \to \infty} \frac{m}{n}$$

Consider the example of the probability of obtaining a tail in a
toss of a coin, which is $\frac{1}{2}$ as per the former definition.
Suppose 1000 tosses result in 531 tails; then the relative frequency
of tails is $\frac{531}{1000}$ or 0.531. If another 2000 tosses result
in 982 tails, then the relative frequency in 3000 tosses is
$\frac{531 + 982}{3000} = 0.5043$, and by continuing this process a
closer value of the probability of the event 'tail' (0.5) is
obtained.
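
The relative-frequency idea can be tried out numerically. The sketch below is illustrative only: `relative_frequency` and `simulate_tail_frequency` are hypothetical helper names, and the simulated coin is a pseudo-random stand-in for real tosses. It first pools the toss counts quoted above (1513 tails in 3000 tosses) and then simulates longer runs.

```python
import random


def relative_frequency(favourable, trials):
    """Relative frequency of an event observed `favourable` times in `trials`."""
    return favourable / trials


first_run = relative_frequency(531, 1000)
pooled = relative_frequency(531 + 982, 1000 + 2000)
print(first_run, round(pooled, 4))


def simulate_tail_frequency(tosses, seed=0):
    """Toss a simulated fair coin `tosses` times and return the share of tails."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    tails = sum(rng.random() < 0.5 for _ in range(tosses))
    return tails / tosses


for tosses in (100, 10_000, 100_000):
    print(tosses, simulate_tail_frequency(tosses))
```

As the number of tosses grows, the simulated relative frequency tends to settle near 0.5.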

The former definition of probability is also known as 'a priori',
classical or mathematical probability, whereas the latter is known
as statistical or empirical probability.


1.7. RULES OF PROBABILITY

1. The probability of an event, $p(E)$, always lies between 0 and 1:
    $$0 \le p(E) \le 1$$

Thus if $p(E) = 0$, the event is certain not to occur, i.e. it is
impossible, and if $p(E) = 1$, the event is certain to occur.


If the event is impossible, it means that there are no possibilities
for that particular event to occur among all the possibilities. Thus
the probability of my being selected as captain of the Indian hockey
team is zero, since I am neither competent nor capable, as I have
never played that game, and also because I have crossed the age for
playing it. On the other hand, the probability of winning all the
prizes in a given lottery draw is 1 if all the tickets in all the
series are purchased, since the event then includes the entire
Sample Space.

2. The probability of a particular event not occurring is equal to
1 minus the probability that it will occur (complement of an event):
    $$p(E') = 1 - p(E)$$
Since
    $$p(E) = \frac{m}{n}$$
    $$p(\text{not } E) = \frac{n - m}{n} = 1 - \frac{m}{n} = 1 - p(E)$$

The probability 1 represents the entire group of possibilities;
subtracting the possibilities favourable to a particular event leaves
the rest of the possibilities.

Thus, the probability of getting a tail in a throw of a coin is
nothing but $1 - p(\text{Head}) = 1 - \frac{1}{2} = \frac{1}{2}$.


1.8. EXAMPLES 
1. Determine the probability of the appearance of an even number
in a single toss of a fair die.

Here there are 6 possible equally likely cases and the number of
cases favourable to the occurrence of the event, namely an even
number, is 3 (2, 4 or 6); hence $p = \frac{3}{6} = \frac{1}{2}$.


2. What is the probability of obtaining a sum of 7 in a single toss
of a pair of dice?

The total number of possible and equally likely cases is 36 and
the sum 7 can occur in the following manner:

    1st die    2nd die
       1          6
       2          5
       3          4
       4          3
       5          2
       6          1

i.e. there are 6 ways of obtaining the sum 7 in a single throw of a
pair of dice, so that $p = \frac{6}{36} = \frac{1}{6}$.
3. Two coins are tossed. Find the probability that:
(a) both coins show heads, (b) both coins show tails, (c) one
coin shows head and the other tail, and (d) both show the same face.
The total possible ways, or the Sample Space corresponding to this
experiment, is
    S = {HH, HT, TH, TT}
(a) $p(A) = p(\text{both heads}) = p(HH) = \frac{1}{4}$
(b) $p(B) = p(\text{both tails}) = p(TT) = \frac{1}{4}$
(c) $p(C) = p(\text{one head, another tail}) = p(HT, TH) = \frac{2}{4} = \frac{1}{2}$
(d) $p(D) = p(\text{both show same face}) = p(HH, TT) = \frac{2}{4} = \frac{1}{2}$


4. An unbiased die is tossed. Find the probability of (a) a prime
number turning up and (b) the square of a natural number coming up.

(a) There are 6 possible equally likely cases and there are 3 ways
of obtaining a prime number, $E = \{2, 3, 5\}$.
Hence $p(\text{prime number}) = \frac{3}{6} = \frac{1}{2}$.
(b) For the event of the square of a natural number coming up,
    $p(\text{square of a natural number}) = \frac{2}{6} = \frac{1}{3}$
since 1 and 4 are favourable to the event.

5. Determine the probability $p(E)$ for the event of a head
appearing in the next toss of a coin if out of 200 tosses 114 were
tails.

The number of heads obtained in 200 tosses is $200 - 114 = 86$, and
hence the probability of a head is nothing but the relative
frequency, namely $\frac{86}{200} = 0.43$, which is the empirical or
estimated probability.

6. 40 persons were questioned about their voting. 15 had voted
for Congress I, 14 for the opposition and the rest did not vote. What
is the probability that an individual selected voted for Congress I?
For the opposition? Did not vote? If 3 persons are picked, list all
the possibilities.

    Pr(Congress I)   = 15/40 = 0.375
    Pr(opposition)   = 14/40 = 0.35
    Pr(did not vote) = 11/40 = 0.275

If three persons are picked, all the possibilities are

    Congress I      3  2  2  1  1  1  0  0  0  0
    opposition      0  1  0  1  2  0  2  1  3  0
    did not vote    0  0  1  1  0  2  1  2  0  3

7. A bag contains 3 white, 4 green and 5 pink balls. What is
the probability that two balls drawn are green and pink?

The total number of balls in the bag $= 3 + 4 + 5 = 12$. Two balls
can be selected out of 12 balls in
${}_{12}C_2 = \frac{12 \cdot 11}{1 \cdot 2} = 66$ ways. Out of 4 green
balls one can be selected in ${}_4C_1$ ways and one pink in ${}_5C_1$
ways, so that the total number of favourable cases is
${}_4C_1 \cdot {}_5C_1 = 20$.
Hence the probability of one green and one pink ball
    $$= \frac{20}{66} = \frac{10}{33}$$
8. Two ordinary dice are thrown. What is the probability that
(i) the product of the numbers on their uppermost faces lies between
7 and 13, (ii) the sum of the numbers on their uppermost faces lies
between 7 and 13?
The total possible cases = 36.
(i) The favourable cases for the product of the numbers to be
between 7 and 13 are

    die 1   die 2
      2       4
      4       2
      3       3
      2       5
      5       2
      2       6
      6       2
      3       4
      4       3

i.e. 9. Hence $p = \frac{9}{36} = \frac{1}{4}$.

(ii) The favourable cases for the sum of the numbers to be between
7 and 13 are

    die 1   die 2       die 1   die 2
      2       6           6       4
      6       2           4       6
      3       5           5       5
      5       3           5       6
      4       4           6       5
      4       5           6       6
      5       4
      6       3
      3       6

i.e. 15. Hence $p = \frac{15}{36} = \frac{5}{12}$.

9. From a pack of 52 cards, three are drawn at random. Find
the chance that they are a king, a queen and an ace.

Three cards can be selected out of 52 cards in ${}_{52}C_3$ ways. A
king, a queen and an ace can be selected in
${}_4C_1 \cdot {}_4C_1 \cdot {}_4C_1$ ways, which is the number of
favourable cases, since there are 4 kings, 4 queens and 4 aces in a
pack of cards and one of each is drawn.
Hence the probability
    $$= \frac{{}_4C_1 \cdot {}_4C_1 \cdot {}_4C_1}{{}_{52}C_3} = \frac{64}{22100} = \frac{16}{5525}$$


10. A bag contains 7 white and 6 red balls. Two balls are drawn
at random. Find the probability that (i) they are of the same colour
and (ii) they are of different colours.

The number of ways of drawing two balls out of 13 (7 white + 6 red) is
    $${}_{13}C_2 = \frac{13 \cdot 12}{1 \cdot 2} = 78$$
(i) 2 balls of the same colour can be drawn at random in
${}_7C_2 + {}_6C_2$ ways, i.e.
$\frac{7 \cdot 6}{1 \cdot 2} + \frac{6 \cdot 5}{1 \cdot 2} = 21 + 15 = 36$.
Hence the probability that the 2 balls are of the same colour is
    $$\frac{36}{78} = \frac{6}{13}$$
(ii) The probability that they are of different colours is
    $$1 - p(\text{same colour}) = 1 - \frac{6}{13} = \frac{7}{13}$$
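
The selection problems in Examples 7, 9 and 10 all reduce to ratios of combination counts. The following Python sketch (variable names are illustrative; it assumes Python 3.8+ for `math.comb`) recomputes them with exact fractions.

```python
from fractions import Fraction
from math import comb

# Example 7: one green and one pink ball from 3 white, 4 green, 5 pink
p_green_and_pink = Fraction(comb(4, 1) * comb(5, 1), comb(12, 2))

# Example 9: a king, a queen and an ace in three cards from 52
p_king_queen_ace = Fraction(comb(4, 1) ** 3, comb(52, 3))

# Example 10: two balls from 7 white and 6 red
p_same_colour = Fraction(comb(7, 2) + comb(6, 2), comb(13, 2))
p_different_colour = 1 - p_same_colour

print(p_green_and_pink)    # 10/33
print(p_king_queen_ace)    # 16/5525
print(p_same_colour)       # 6/13
print(p_different_colour)  # 7/13
```

`Fraction` reduces each ratio to lowest terms automatically, which makes it easy to compare with the hand calculation.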

11. What is the probability that a leap year selected at random
will contain 53 Saturdays?

Since a leap year contains 366 days, it has 52 complete weeks and two
extra days.

These two days can be (1) Sunday, Monday (2) Monday, Tuesday
(3) Tuesday, Wednesday (4) Wednesday, Thursday (5) Thursday,
Friday (6) Friday, Saturday and (7) Saturday, Sunday.

Hence the total number of possibilities = 7, these being the
arrangements of the extra two days shown above.

The number of favourable cases = 2, since one of the two days should
be a Saturday and only two such possibilities are seen.

Hence the desired probability $= \frac{2}{7}$.


12. Determine the probability $p$ of a non-defective bolt being
found if out of 600 bolts already examined 12 were defective.

Probability of a defective bolt $= \frac{12}{600}$.
Hence the probability of a non-defective bolt
    $$= 1 - \frac{12}{600} = \frac{588}{600} = 0.98$$


1.9. CONDITIONAL PROBABILITY 

For the purpose of illustration, consider two events, say $E_1$ and
$E_2$. Suppose it is desired to know whether $E_2$ will occur subject
to the condition that $E_1$ is known to have occurred or is certain
to occur. Let us consider only the experimental outcomes of event
$E_1$, so that the Sample Space is nothing but the simple events
comprising $E_1$.

In figure 6 note that some points in the region $E_2$ are also found
inside the region $E_1$, that is, in the overlapping part of $E_1$
and $E_2$. Let $n(E_1)$ denote the number of points inside the region
$E_1$ and $n(E_1 \text{ and } E_2)$ denote the number of points inside
both the regions $E_1$ and $E_2$. Then the probability that $E_2$
will occur, if the Sample Space is restricted to the set of points
inside the region $E_1$, is
    $$\frac{n(E_1 \text{ and } E_2)}{n(E_1)}$$

Fig. 6. Sample Space 'S'.

This, in other words, is nothing but the probability that $E_2$ will
occur subject to the condition that $E_1$ occurs, denoted by
$P(E_2/E_1)$. In the figure, the number of points comprising the
event $E_1$ is 21, i.e. $n(E_1) = 21$. The number of points that lie
inside $E_1$ and also inside $E_2$ is 6, i.e.
$n(E_1 \text{ and } E_2) = 6$, so that
    $$P(E_2/E_1) = \frac{6}{21} = \frac{2}{7}$$
Definition: If $A$ and $B$ are two events in a Sample Space 'S' such
that $P(B) > 0$, then the conditional probability of the event $A$,
given that the event $B$ has happened, denoted by $P(A/B)$, is
    $$P(A/B) = \frac{P(A, B)}{P(B)}$$
Similarly, the conditional probability of $B$ given that $A$ has
already occurred, denoted by $P(B/A)$, such that $P(A) > 0$, is
    $$P(B/A) = \frac{P(A, B)}{P(A)}$$


Example: A box contains 5 red and 4 black balls. Two balls are
drawn without replacement. What is the probability that the second
ball is red if it is known that the first ball is red?

Let $B$ be the event that the first ball is red and
$A$ the event that the second ball is also red.

Hence $P(A, B)$ is the probability that both the first and the second
are red. 2 balls can be selected out of 9 in ${}_9C_2$ ways and the
sample space thus contains ${}_9C_2$ points, each with a probability
of $\frac{1}{{}_9C_2}$.

The number of ways of getting two red balls is ${}_5C_2$ and hence
    $$P(A, B) = \frac{{}_5C_2}{{}_9C_2} = \frac{10}{36} = \frac{5}{18}$$
$P(B)$ is the probability of the 1st ball drawn being red, which is
$\frac{5}{9}$, so that
    $$P(A/B) = \frac{P(A, B)}{P(B)} = \frac{5/18}{5/9} = \frac{1}{2}$$
Note that this can be computed directly also. If the 1st ball is red
then 4 red and 4 black balls are left in the box, so that
    $$P(A/B) = \frac{4}{8} = \frac{1}{2}$$
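
The same conditional probability can be obtained by brute-force enumeration of all ordered draws. The Python sketch below is an illustration; the labels "R" and "K" (for black) and the variable names are assumptions made here. It applies the definition P(A/B) = P(A, B) / P(B) to the box of 5 red and 4 black balls.

```python
from fractions import Fraction
from itertools import permutations

# 5 red balls ("R") and 4 black balls ("K"), identified by position 0..8
balls = ["R"] * 5 + ["K"] * 4

# Every ordered pair of distinct balls: a draw of two without replacement
draws = list(permutations(range(len(balls)), 2))

first_red = [d for d in draws if balls[d[0]] == "R"]
both_red = [d for d in first_red if balls[d[1]] == "R"]

p_B = Fraction(len(first_red), len(draws))    # first ball red
p_AB = Fraction(len(both_red), len(draws))    # both balls red
p_A_given_B = p_AB / p_B

print(p_B, p_AB, p_A_given_B)  # 5/9 5/18 1/2
```

Restricting attention to the draws in `first_red` is exactly the "restricted Sample Space" idea used to introduce conditional probability.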


1.10. TWO BASIC THEOREMS OF PROBABILITY 


Addition rule 
Normally the application of probability does not deal with just
one event but with a number of related events. Let the Sample
Space 'S' comprise events $A$ and $B$, that is, consider an
experiment with events $A$ and $B$. In this experiment both $A$ and
$B$ may occur, and for this joint event ($A$ and $B$) the probability
is denoted by $P(A \text{ and } B)$. Further, one of the events $A$
or $B$ may occur.
The event ($A$ or $B$) occurs if $A$ occurs but $B$ does not, if $B$
occurs but $A$ does not, or if both $A$ and $B$ occur; its
probability is denoted by $P(A \text{ or } B)$. Here '$A$ or $B$'
means either one, or the other, or both.
If the events $A$ and $B$ are mutually exclusive, i.e. the occurrence
of the one prevents the occurrence of the other, and if $n(A)$ is the
number of points lying inside the region $A$ and $n(B)$ the number of
points lying inside the region $B$, then the total number of points
associated with the occurrence of either $A$ or $B$ is the sum of
these two numbers.
Thus
    $$P(A \text{ or } B) = \frac{n(A) + n(B)}{n} = \frac{n(A)}{n} + \frac{n(B)}{n} = P(A) + P(B)$$
where $n$ is the number of Sample points in the Sample Space.

Fig. 7. Sample Space with two mutually exclusive events $A$ and $B$.

In the above diagram the total number of Sample points is 37 $(n)$.
The number of points in $A$ is 8 $[n(A)]$.
The number of points in $B$ is 10 $[n(B)]$.
    $$P(A \text{ or } B) = \frac{8 + 10}{37} = \frac{8}{37} + \frac{10}{37}$$

If $A$ and $B$ are any two events in 'S', the Sample Space, then the
probability of ($A$ or $B$) can be obtained whether the events $A$
and $B$ are mutually exclusive or not. Consider the case where the
two events $A$ and $B$ are not mutually exclusive; then the
probability of the happening of at least one of them is
    $$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$


Proof: Let $A$ and $B$ be two events which are not mutually
exclusive. The event $A$ can occur alone or can occur along with $B$.
In other words, event $A$ can occur in two mutually exclusive forms,
$AB$ and $A\bar{B}$.

Therefore
    $$P(A) = P(AB) + P(A\bar{B})$$

Fig. 8. Sample Space.

In the same way event $B$ can occur in two mutually exclusive
forms, $AB$ and $\bar{A}B$.
Hence $$P(B) = P(AB) + P(\bar{A}B)$$
Thus $$P(A) + P(B) = [P(AB) + P(A\bar{B}) + P(\bar{A}B)] + P(AB) = P(A \cup B) + P(AB)$$
$$\therefore\ P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
i.e. $$P(A \text{ or } B) = P(A) + P(B) - P(A, B)$$


The probability that the event $A$ or the event $B$ or both occur is
equal to the sum of the probability that the event $A$ occurs and the
probability that the event $B$ occurs, minus the probability that
both the events $A$ and $B$ occur.


Example: Find the probability of drawing a king or a queen in
a single draw from a deck of cards.

    $$P(\text{king or queen}) = P(\text{king}) + P(\text{queen}) = \frac{4}{52} + \frac{4}{52} = \frac{8}{52} = \frac{2}{13}$$

Since both a queen and a king cannot be drawn in a single draw,
these events are mutually exclusive and hence $P(A, B) = 0$.

On the other hand, if we want to calculate the probability of
drawing a king or a spade or both, then the events $A$ (drawing a
king) and $B$ (drawing a spade) are not mutually exclusive, since the
king of spades can be drawn.

Hence
    $$P(A \text{ or } B) = P(A) + P(B) - P(A, B) = \frac{4}{52} + \frac{13}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}$$
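
Both card examples can be confirmed by counting over an explicit 52-card deck. The Python sketch below is illustrative; the deck representation and the helper `prob` are assumptions made here. It checks the addition rule with and without the overlap term.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs


def prob(event):
    """Probability that one card drawn at random satisfies `event`."""
    favourable = sum(1 for card in deck if event(card))
    return Fraction(favourable, len(deck))


def is_king(card):
    return card[0] == "K"


def is_queen(card):
    return card[0] == "Q"


def is_spade(card):
    return card[1] == "spades"


# Mutually exclusive events: no overlap term is needed
p_king_or_queen = prob(lambda c: is_king(c) or is_queen(c))

# Not mutually exclusive: the king of spades is counted in both events
p_king_or_spade = prob(lambda c: is_king(c) or is_spade(c))
p_by_rule = prob(is_king) + prob(is_spade) - prob(lambda c: is_king(c) and is_spade(c))

print(p_king_or_queen)             # 2/13
print(p_king_or_spade, p_by_rule)  # 4/13 4/13
```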
The addition rule can be generalised to more than two events also.

P(A or B or C or D) = P(A) + P(B) + P(C) + P(D)
if the events are mutually exclusive

and P(A or B or C) = P(A) + P(B) + P(C) − P(A, B) − P(B, C) − P(A, C) + P(A, B, C)

if the events A, B and C are not mutually exclusive
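
The two card examples above can be checked by listing the 52 Sample points and counting. The following is only a sketch: the names RANKS, SUITS, DECK and prob are chosen here for illustration and are not part of the text.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = set(product(RANKS, SUITS))  # 52 equally likely sample points


def prob(event):
    """Classical probability: favourable points / total points."""
    return Fraction(len(event), len(DECK))


kings = {card for card in DECK if card[0] == "K"}
queens = {card for card in DECK if card[0] == "Q"}
spades = {card for card in DECK if card[1] == "spades"}

# Mutually exclusive events: no card is both a king and a queen.
p_king_or_queen = prob(kings | queens)

# Not mutually exclusive: the king of spades belongs to both events.
p_king_or_spade = prob(kings | spades)
by_addition_rule = prob(kings) + prob(spades) - prob(kings & spades)

print(p_king_or_queen, p_king_or_spade, by_addition_rule)
```

Counting the union directly and applying the addition rule give the same value for the king-or-spade case, because the rule subtracts the one card that would otherwise be counted twice.
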


Multiplication rule 

The multiplication rule simplifies the calculation of the probabilities of compound events. A compound event is an event which is made up of two or more single events, as when a die is tossed twice or thrice.

If A and B are two events in a Sample Space, the event that 'both A and B occur' is denoted by AB, and

P(AB) = P(A) P(B/A)
i.e. the probability that both the events will occur is equal to the product of the probability that the first event will occur and the conditional probability that the second event will occur when the first event has occurred. Similarly

P(AB) = P(B) P(A/B)


Proof: Let n be the total number of mutually exclusive, equally likely and exhaustive cases, of which m are favourable to the event B.

The cases favourable to both the events A and B (say m1) are included in the m cases favourable to B. Then the probability P(AB) that both the events A and B will happen is

P(AB) = m1/n = (m/n)(m1/m)

It is observed that m/n is nothing but P(B), whereas the ratio m1/m represents the conditional probability P(A/B) of A, assuming that B has occurred.

Hence P(AB) = P(B) P(A/B)
If the events are independent, P(A/B) = P(A)
so that P(AB) = P(A) P(B)


The conditional probability of the event B, given the event A, denoted by P(B/A), is defined by

P(B/A) = P(AB)/P(A)
Hence P(AB) = P(A) P(B/A)

In the case of independent events
P(AB) = P(A) P(B)
For three events A, B, C
P(ABC) = P(A) P(B/A) P(C/AB)
Illustration: A box contains 4 white and 3 black balls. Let A be the event that the 'first ball drawn is black' and B the event that the 'second ball drawn is black', the balls not being replaced after being drawn. Find the probability that both balls drawn are black.

P(AB) = P(A) P(B/A), since the events are dependent
P(A) = 3/7 = probability that the first ball drawn is black

P(B/A) = 2/6 = 1/3 = probability that the second ball drawn is black given that the first ball drawn is black.

Hence P(AB) = (3/7)(1/3) = 1/7


Consider independent events

If the probability that 'A' will be alive for 10 years is 0.75, the probability that 'B' will be alive for 10 years is 0.40 and that of 'C' being 0.25, what is the probability that all the three will be alive for 10 years?

P(ABC) = P(A) P(B) P(C)
= (0.75)(0.40)(0.25) = 0.075, since the events are independent
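
The two uses of the multiplication rule above, for dependent and for independent events, can be worked with exact fractions. This is a minimal sketch; the function name both_black is an illustrative choice.

```python
from fractions import Fraction


def both_black(black, white):
    """P(first black and second black) when drawing without replacement."""
    p_first = Fraction(black, black + white)              # P(A)
    p_second_given_first = Fraction(black - 1, black + white - 1)  # P(B/A)
    return p_first * p_second_given_first


# Dependent events: 3 black and 4 white balls, no replacement.
p_dependent = both_black(black=3, white=4)

# Independent events: the three survival probabilities simply multiply.
p_independent = Fraction(75, 100) * Fraction(40, 100) * Fraction(25, 100)

print(p_dependent, float(p_independent))
```
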


1.11. SOLVED EXAMPLES 
1. A card is drawn from a pack of cards. What is the probability that the card drawn is a queen? What is the probability that the card drawn is a 'heart'?

As there are 52 cards in a pack, of which 4 are queens, the number of possible cases is 52 and the number of cases favourable to the event of drawing a 'queen' is 4. Hence the probability = 4/52 = 1/13

The probability of drawing a 'heart' is 13/52 = 1/4, since there are 52 possible cases and 13 favourable cases.

2. A person writes 7 letters and addresses 7 envelopes for them, and asks the servant, who is illiterate, to put the letters in their respective envelopes and post them. What is the probability that the servant puts the letters in wrong envelopes?

7 letters can be put in 7 envelopes in 7! ways, i.e. 5040 possible ways. The number of ways of putting every letter in its own envelope is 1.

Hence, if A is the event that all the letters are put correctly, P(A) = 1/5040

Hence the probability of the event A′ that the letters and envelopes are not all put correctly is

P(A′) = 1 − P(A) = 1 − 1/5040 = 5039/5040


3. Six cards are drawn at random from a deck of 52 cards. What is the probability that 3 will be red and 3 black?

Six cards can be drawn from 52 cards in C(52, 6) ways. Out of 52 cards, 26 are black and 26 are red, and 3 of either colour can be selected in C(26, 3) ways.

Hence the probability that 3 will be black and 3 red = C(26, 3)·C(26, 3)/C(52, 6)

= (26·25·24)/(3·2·1) × (26·25·24)/(3·2·1) × (6·5·4·3·2·1)/(52·51·50·49·48·47) = 39000/117453 = 0.3320


4. A room has 3 light holders. From a collection of 10 light bulbs of which 6 are no good, a person selects 3 at random and puts them in the holders. What is the probability that he will have light?
Let us determine the probability of not getting light.
3 bulbs can be selected out of 10 in C(10, 3) ways. Three bad bulbs can be selected out of the 6 in C(6, 3) ways.

Hence, if A is the event of not getting light, P(A) = C(6, 3)/C(10, 3) = (6·5·4)/(3·2·1) × (3·2·1)/(10·9·8) = 1/6

The probability of having light = P(A′) = 1 − P(A) = 5/6
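
Examples 3 and 4 both reduce to counting selections, which can be sketched with math.comb from the Python standard library (Python 3.8 or later is assumed). The variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Example 3: 3 red and 3 black cards among 6 drawn from 52.
p_3_red_3_black = Fraction(comb(26, 3) * comb(26, 3), comb(52, 6))

# Example 4: light fails only if all 3 selected bulbs are bad (6 bad of 10).
p_no_light = Fraction(comb(6, 3), comb(10, 3))
p_light = 1 - p_no_light

print(p_3_red_3_black, float(p_3_red_3_black))
print(p_no_light, p_light)
```

Working through the complementary event in Example 4 avoids adding up the separate cases of one, two or three good bulbs.
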


5. Write the Sample Spaces for the following random experiments:
(i) Throw of 2 fair dice (PUC, Oct. 1977)

The Sample Space S for throwing two fair dice is

S = { (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
      (2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
      (3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
      (4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
      (5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
      (6,1) (6,2) (6,3) (6,4) (6,5) (6,6) }

(ii) A coin and a die are tossed
The Sample Space corresponding to this experiment is

S = { (H, 1) (H, 2) (H, 3) (H, 4) (H, 5) (H, 6)
      (T, 1) (T, 2) (T, 3) (T, 4) (T, 5) (T, 6) }
(iii) Three coins are tossed
S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}
(iv) There are 4 different ball point pens and three different note books. A student picks up one ball point pen and one note book.
Let the 4 ball point pens be denoted by b1, b2, b3 and b4 and the 3 note books be denoted by n1, n2 and n3.
Sample Space = { (b1, n1), (b1, n2), (b1, n3), (b2, n1), (b2, n2), (b2, n3),
                 (b3, n1), (b3, n2), (b3, n3), (b4, n1), (b4, n2), (b4, n3) }
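
Each of the four Sample Spaces above is a set of ordered combinations, so it can be generated mechanically. The sketch below uses itertools.product; the list names are illustrative only.

```python
from itertools import product

# (i) two dice: ordered pairs of faces
two_dice = list(product(range(1, 7), repeat=2))

# (ii) a coin and a die
coin_and_die = list(product("HT", range(1, 7)))

# (iii) three coins, written as strings such as "HHT"
three_coins = ["".join(toss) for toss in product("HT", repeat=3)]

# (iv) one of 4 pens together with one of 3 note books
pens = ["b1", "b2", "b3", "b4"]
note_books = ["n1", "n2", "n3"]
pen_and_book = list(product(pens, note_books))

print(len(two_dice), len(coin_and_die), len(three_coins), len(pen_and_book))
```
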


6. A card is picked randomly from a deck of 52 cards. Let A be the event corresponding to 'queen', B corresponding to 'ten', C corresponding to 'spade' and D corresponding to 'picture card' being chosen. Write down the elements belonging to (a) A ∪ B (b) A ∪ B ∪ C (c) A ∩ D (d) B ∩ C (e) C ∩ D and hence obtain the probabilities.

Here A = queen, B = 10, C = spade, D = picture card

(a) A ∪ B (meaning either A or B)
= {Spade queen, Diamond queen, Heart queen, Club queen,
   Spade ten, Diamond ten, Heart ten, Club ten} = 8

Hence P(A ∪ B) = 8/52

(b) A ∪ B ∪ C
= {Diamond queen, Heart queen, Club queen, Spade queen,
   Diamond ten, Heart ten, Club ten, Spade ten,
   Spade: A, 2, 3, 4, 5, 6, 7, 8, 9, J, K} = 19
Hence P(A ∪ B ∪ C) = 19/52

(c) A ∩ D (meaning A and D)
= {Spade queen, Diamond queen, Heart queen, Club queen} = 4

P(A ∩ D) = 4/52

(d) B ∩ C = {Spade 10} = 1
P(B ∩ C) = 1/52
(e) C ∩ D = {Spade J, Spade Q, Spade K} = 3

P(C ∩ D) = 3/52
7. 30 tickets are marked with numbers 1 to 30. If one is drawn at random, find the probability that it is a multiple of 3 or 5.
Let A be the multiples of 3 and B the multiples of 5.

The multiples of 3 are 3, 6, 9, 12, 15, 18, 21, 24, 27 and 30 = 10

The multiples of 5 are 5, 10, 15, 20, 25 and 30 = 6, of which 6 − 2 = 4 are new
(since 15 and 30 are common in both cases)

Probability (A or B) = 14/30 = 7/15
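
The count in Example 7 can be reproduced with sets, where the union automatically counts 15 and 30 only once. A small sketch with illustrative names:

```python
from fractions import Fraction

tickets = range(1, 31)
multiples_of_3 = {n for n in tickets if n % 3 == 0}
multiples_of_5 = {n for n in tickets if n % 5 == 0}

common = multiples_of_3 & multiples_of_5   # counted in both events
favourable = multiples_of_3 | multiples_of_5

p_3_or_5 = Fraction(len(favourable), len(tickets))
print(sorted(common), len(favourable), p_3_or_5)
```
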
8. A bag contains 8 black, 3 red and 9 white balls. If 3 balls are drawn at random, find the probability that (a) all are black (b) 2 are black and 1 is white and (c) 1 of each colour.
Total number of balls in the bag

= 8 black + 3 red + 9 white = 20
3 balls can be drawn at random in C(20, 3) ways.

(a) Probability that all the 3 are black

= C(8, 3)/C(20, 3) = (8·7·6)/(3·2·1) × (3·2·1)/(20·19·18) = 14/285

(b) 2 are black and 1 is white

Prob (2 black and 1 white) = C(8, 2)·C(9, 1)/C(20, 3) = (8·7)/(2·1) × 9 × (3·2·1)/(20·19·18) = 21/95

(c) 1 of each colour

Prob (1 black, 1 red and 1 white) = C(8, 1)·C(3, 1)·C(9, 1)/C(20, 3)
= (8·3·9)/(20·19·18) × (3·2·1) = 18/95
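
The three parts of Example 8 follow one pattern: the number of favourable selections divided by C(20, 3). A minimal sketch, assuming Python 3.8 or later for math.comb; the names are illustrative.

```python
from fractions import Fraction
from math import comb

BLACK, RED, WHITE = 8, 3, 9
total_ways = comb(BLACK + RED + WHITE, 3)  # ways of drawing 3 balls from 20

p_all_black = Fraction(comb(BLACK, 3), total_ways)
p_two_black_one_white = Fraction(comb(BLACK, 2) * comb(WHITE, 1), total_ways)
p_one_of_each = Fraction(comb(BLACK, 1) * comb(RED, 1) * comb(WHITE, 1), total_ways)

print(p_all_black, p_two_black_one_white, p_one_of_each)
```
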


9. If the letters of the word 'MATRICES' are arranged at random, what is the chance that there will be exactly 3 letters between 'M' and 'A'?

The 2 letters M and A can occupy positions in 8P2 ways (since there are 8 letters in the word 'MATRICES') = 8·7 = 56
The ways in which there will be 3 letters between M and A are:

    M in the 1st position, A in the 5th
    M in the 2nd position, A in the 6th
    M in the 3rd position, A in the 7th
    M in the 4th position, A in the 8th
    A in the 1st position, M in the 5th
    A in the 2nd position, M in the 6th
    A in the 3rd position, M in the 7th
    A in the 4th position, M in the 8th

i.e. 8 ways in all.

Hence the probability = 8/56 = 1/7
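
The listing in Example 9 can be checked by enumerating every ordered pair of positions for M and A. In this sketch 'exactly 3 letters between' is taken to mean that the two positions differ by 4.

```python
from fractions import Fraction
from itertools import permutations

# Ordered (position of M, position of A) pairs among the 8 places.
placements = list(permutations(range(1, 9), 2))

# Exactly 3 letters lie between M and A when the positions differ by 4.
favourable = [(m, a) for m, a in placements if abs(m - a) == 4]

p_three_between = Fraction(len(favourable), len(placements))
print(len(placements), len(favourable), p_three_between)
```
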

10. From a set of raffle tickets numbered 1 to 100, three are drawn at random. What is the probability that all are odd?

Three can be drawn out of 100 raffle tickets in C(100, 3) ways. The total odd numbered tickets are 50 and 3 can be selected out of 50 in C(50, 3) ways.

Hence the probability that all the 3 tickets are odd numbered

= C(50, 3)/C(100, 3) = (50·49·48)/(3·2·1) × (3·2·1)/(100·99·98) = 4/33


11. A card is drawn from a deck of cards. What is the probability that it is an ace or king?
Probability (ace or king) = P(ace) + P(king)

= 4/52 + 4/52 = 8/52 = 2/13


12. The probabilities of 3 students E1, E2 and E3 solving the problem of 'probability' in the examination are 1/2, 1/4 and 1/8 respectively. If all the three of them attempt independently what is the probability that the problem is solved?

P(problem of probability being solved) = P(E1 + E2 + E3)
= P(E1) + P(E2) + P(E3) − P(E1 E2) − P(E2 E3) − P(E1 E3) + P(E1 E2 E3)
= P(E1) + P(E2) + P(E3) − P(E1) P(E2) − P(E2) P(E3) − P(E1) P(E3) + P(E1) P(E2) P(E3)
= 1/2 + 1/4 + 1/8 − 1/(2·4) − 1/(4·8) − 1/(2·8) + 1/(2·4·8)
= 43/64


13. A ball is drawn at random from a box containing 10 red, 30 white, 20 blue and 15 orange balls.

Find the probability that it is (a) orange or red (b) not red or blue (c) not blue (d) white (e) red, white or blue.

The total number of balls in the box is 30 W + 20 B + 10 R + 15 O = 75 balls.

(a) Probability (orange or red)

= P(O + R) = P(O) + P(R) = 15/75 + 10/75 = 1/3

(b) not red or blue. Consider the probability of red or blue
P(R + B) = P(R) + P(B) = 10/75 + 20/75 = 2/5
Hence P(not red or blue) = 1 − P(red or blue)
= 1 − 2/5 = 3/5

(c) not blue:

Prob (blue) = 20/75 = 4/15

Prob (not blue) = 1 − 4/15 = 11/15

(d) white:

Prob (white) = 30/75 = 2/5

(e) red, white or blue
P(red, white or blue) = P(red) + P(white) + P(blue)

= 10/75 + 30/75 + 20/75 = 60/75 = 4/5
14. The probability that a 45 year old man will be alive for another 10 years is 0.78 and the probability that his wife who is 38 years will live for 10 more years is 0.81. What is the probability that this couple will be alive 10 years hence?

Since these two are independent events

P(E1 E2) = P(E1) P(E2) = (0.78)(0.81) = 0.6318

15. One shot is fired from each of the three guns, and the probabilities of hitting the target by these guns are 0.71, 0.68 and 0.23 respectively. Find the probability that (a) exactly one hit is registered and (b) at least two hits are registered.

Let A, B and C be the events of the target being hit by the three guns, and let A′, B′ and C′ denote the corresponding misses.

P(A) = 0.71, P(B) = 0.68, P(C) = 0.23
hence P(A′) = 1 − 0.71 = 0.29, P(B′) = 0.32, P(C′) = 0.77
(a) Exactly one hit is registered
Gun A hitting the target whereas B and C do not hit (AB′C′)
Gun B hitting the target whereas C and A do not hit (BC′A′)
Gun C hitting the target whereas A and B do not hit (CA′B′)
P(exactly one hit registered) = P(AB′C′) + P(BC′A′) + P(CA′B′)
= P(A) P(B′) P(C′) + P(B) P(C′) P(A′) + P(C) P(A′) P(B′)
= (0.71)(0.32)(0.77) + (0.68)(0.77)(0.29) + (0.23)(0.29)(0.32)
= 0.1749 + 0.1518 + 0.0213
= 0.3480

(b) Exactly two hits can be registered in the following ways, which are mutually exclusive too.

A and B hit but not C (ABC′)
B and C hit but not A (BCA′)
C and A hit but not B (CAB′)
Prob (exactly 2 hits registered) = P(ABC′) + P(BCA′) + P(CAB′)
= P(A) P(B) P(C′) + P(B) P(C) P(A′) + P(C) P(A) P(B′)
= (0.71)(0.68)(0.77) + (0.68)(0.23)(0.29) + (0.23)(0.71)(0.32)
= 0.3718 + 0.0454 + 0.0523
= 0.4695
At least two hits also includes the case in which all three guns hit, for which P(ABC) = (0.71)(0.68)(0.23) = 0.1110.
Prob (at least 2 hits registered) = 0.4695 + 0.1110 = 0.5805
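
The case-by-case working of Example 15 can be checked by running through all eight hit or miss patterns of the three guns. This is a sketch only: the helper p_hits is an illustrative name, and math.prod assumes Python 3.8 or later. Note that 'at least two hits' is the sum of the 'exactly two' and 'all three' cases.

```python
from itertools import product
from math import prod

P_HIT = {"A": 0.71, "B": 0.68, "C": 0.23}


def p_hits(k):
    """Probability that exactly k of the independent guns hit the target."""
    total = 0.0
    for pattern in product([True, False], repeat=len(P_HIT)):
        if sum(pattern) == k:
            # Multiply P(hit) or P(miss) for each gun in this pattern.
            total += prod(p if hit else 1 - p
                          for p, hit in zip(P_HIT.values(), pattern))
    return total


exactly_one = p_hits(1)
exactly_two = p_hits(2)
at_least_two = p_hits(2) + p_hits(3)

print(round(exactly_one, 4), round(exactly_two, 4), round(at_least_two, 4))
```

The small differences from the hand calculation in the fourth decimal place arise because the hand calculation rounds each product before adding.
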
16. In a lot of 15 items, 6 are defective. Suppose that one item is drawn at random and found to be defective. What is the probability that the next item drawn at random will be defective?

Let A represent the event that the 1st item drawn is defective and B represent the event that the 2nd item drawn is defective.

P(AB) = P(A) P(B/A)
Clearly P(A) = 6/15 and P(B/A) = 5/14, the latter being the probability that the next item is defective given that the first was defective.
Hence the probability that both items are defective is P(AB) = (6/15)(5/14) = 1/7
Similarly, the probability that the third item drawn also is defective (C) is
P(ABC) = P(A) P(B/A) P(C/AB) = (6/15)(5/14)(4/13) = 4/91
17. Contractors A, B and C are bidding on a contract and one of them has to get it. The probabilities of these 3 getting it are 0.41, 0.27 and 0.19 respectively. Suppose that, if A gets the contract he will choose 'D' as sub-contractor with probability 0.91, if B gets the contract he will choose 'D' with probability 0.42 and if C gets the contract he will choose D with probability 0.22. What is the probability that D will get the subcontract?
The solution can be obtained straightaway by the formula

P(D) = P(A) P(D/A) + P(B) P(D/B) + P(C) P(D/C)
= (0.41)(0.91) + (0.27)(0.42) + (0.19)(0.22)
= 0.3731 + 0.1134 + 0.0418 = 0.5283
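
The formula used in Example 17 weights each conditional probability by the probability of its condition and adds the results. A small sketch with exact fractions; the dictionary names are illustrative.

```python
from fractions import Fraction

# P(contractor gets the contract), as given in the example
p_contract = {"A": Fraction(41, 100), "B": Fraction(27, 100), "C": Fraction(19, 100)}
# P(D is chosen as sub-contractor / that contractor gets the contract)
p_d_given = {"A": Fraction(91, 100), "B": Fraction(42, 100), "C": Fraction(22, 100)}

p_d = sum(p_contract[k] * p_d_given[k] for k in p_contract)
print(float(p_d))
```
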


18. An urn contains 8 orange and 4 green balls. Two balls are drawn without replacement. What is the probability that the second ball is orange if it is known that the first is orange?

Let B be the event that the first ball is orange and A the event that the second ball is orange.

2 balls can be drawn out of 12 balls (8 orange + 4 green) in the urn in C(12, 2) ways.

2 orange balls can be got out of 8 in C(8, 2) ways.

Hence P(A, B) = C(8, 2)/C(12, 2) = 28/66 = 14/33

Alternatively, the probability that the 1st ball is orange is
P(B) = 8/12 = 2/3
and the conditional probability of the event A given that the event B has happened is

P(A/B) = 7/11

as, when the first ball drawn is orange, 7 orange and 4 green balls are left in the urn. Hence
P(A, B) = P(B) P(A/B) = (2/3)(7/11) = 14/33
19. In a certain community it is found that the sex ratio among children is 3 girls to 2 boys. If a family of 5 children is picked up at random, calculate the probability that (a) the eldest is a boy and the rest are all girls (b) the first, third and fifth children are boys and the remaining are girls.

Here P(boy) = 2/5 and P(girl) = 3/5.

(a) P(b g g g g) = (2/5)(3/5)(3/5)(3/5)(3/5) = 162/3125

(b) P(b g b g b) = (2/5)(3/5)(2/5)(3/5)(2/5) = 72/3125


20. Suppose 3 bad bulbs are mixed up with 12 good ones, and that the bulbs are tested one by one until all the defectives are found. What is the probability that the last defective is observed on the 6th testing?

Let B be the event of finding 2 defectives among the first 5 tested and A be the event of finding the third defective on the 6th testing.

Hence P(B) = C(3, 2)·C(12, 3)/C(15, 5) = (3·2)/(2·1) × (12·11·10)/(3·2·1) × (5·4·3·2·1)/(15·14·13·12·11) = 20/91

P(A/B) is the probability of finding the third defective in the sixth testing after B has happened. When B has occurred, there are 15 − 5 = 10 bulbs left, of which one is defective, and the probability of picking it on the 6th testing is

P(A/B) = 1/10

Hence P(A and B) = P(B) P(A/B) = (20/91)(1/10) = 2/91
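
The result of Example 20 can be cross-checked in two ways: by the multiplication rule as in the solution, and by counting the positions of the three defectives directly. In this sketch the second count assumes that every set of 3 positions among the 15 tests is equally likely; after 5 tests, 15 − 5 = 10 bulbs remain.

```python
from fractions import Fraction
from math import comb

BAD, GOOD = 3, 12

# B: exactly 2 defectives among the first 5 bulbs tested.
p_b = Fraction(comb(BAD, 2) * comb(GOOD, 3), comb(BAD + GOOD, 5))

# A given B: the one remaining defective is picked from the 10 bulbs left.
p_a_given_b = Fraction(1, BAD + GOOD - 5)

p_last_on_sixth = p_b * p_a_given_b

# Cross-check: one defective in position 6, the other two in positions 1 to 5.
cross_check = Fraction(comb(5, 2), comb(BAD + GOOD, BAD))

print(p_b, p_a_given_b, p_last_on_sixth, cross_check)
```
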
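21. Let A and B be events with P(A ∪ B) = 3/4, P(A ∩ B) = 1/4 and P(A′) = 2/3, where A′ denotes the complement of A. Find (a) P(A) (b) P(B) and (c) P(A ∩ B′).

(a) P(A) = 1 − P(A′) = 1 − 2/3 = 1/3
(b) P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
3/4 = 1/3 + P(B) − 1/4, so that P(B) = 2/3
(c) P(A ∩ B′) = P(A) − P(A ∩ B) = 1/3 − 1/4 = 1/12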


22. Discuss briefly the axiomatic approach to probability, illustrating by examples how it meets the deficiencies of the classical approach.

The axiomatic approach is a logical system in which probability is deduced from a few axioms; theorems are developed from these axioms, and from the theorems probabilities can be obtained easily.

Let S be a Sample Space and let A be any event in S, which in other words means that A is any subset of S. A function P defined on the events of S is called a probability function on the Sample Space if it satisfies the following axioms.

Axiom 1: P(A) is a real number such that P(A) ≥ 0 for every A in the Sample Space S.

Axiom 2: P(S) = 1
The probability of the entire Sample Space is unity, i.e. P(S) = 1.
Hence for any event A

0 ≤ P(A) ≤ 1

Axiom 3: For any two events A1 and A2 the probability that either A1 or A2 or both occur is

P(A1 ∪ A2) = P(A1) + P(A2) − P(A1 A2)
If A1 and A2 are mutually exclusive, then A1 A2 = ∅ (the null set)
so that P(A1 ∪ A2) = P(A1) + P(A2)
Thus for events A1, A2, A3, ... which are mutually exclusive
P(A1 ∪ A2 ∪ A3 ...) = P(A1) + P(A2) + P(A3) + ...

Theorem 1: Let S be the Sample Space and P the probability function on S. Then the probability that the event A does not happen, denoted by P(A′), is 1 − P(A).
From the definition of the null set ∅ (a set containing no points)
A ∩ A′ = ∅ and A ∪ A′ = S
and hence from axiom 2
P(S) = 1 = P(A ∪ A′)
Again using axiom 3
P(A ∪ A′) = P(A) + P(A′)
Hence P(S) = 1 = P(A) + P(A′)
∴ P(A′) = 1 − P(A)

Theorem 2: For any event A in the Sample Space S,
0 ≤ P(A) ≤ 1

It is clear from axiom 1 that P(A) ≥ 0 and it is only necessary to show that P(A) ≤ 1

Since P(A′) = 1 − P(A)
P(A) + P(A′) = 1

but P(A′) ≥ 0 by axiom 1

Hence P(A) = 1 − P(A′) ≤ 1

Theorem 3: If S0 is a null set then P(S0) = 0
S ∩ S0 = null set

P(S ∪ S0) = P(S) + P(S0) as per axiom 3
but S ∪ S0 = S and P(S) = 1
Hence P(S) = P(S) + P(S0)
so that P(S0) = 0.


23. Let A, B, C be three arbitrary events. Find expressions for the events that, of A, B, C,

(a) only A occurs (b) both A and B but not C occur (c) all 3 events occur (d) at least one occurs (e) at least two occur (f) one and no more occurs (g) two and no more occur (h) none occurs.

Solution (a prime denotes the complement, so that A′ means 'not A')
(a) only A occurs: AB′C′
(b) both A and B occur but not C: ABC′
(c) all three events occur: ABC
(d) at least one occurs: A ∪ B ∪ C
(e) at least two occur: AB ∪ AC ∪ BC
(f) one and no more occurs: AB′C′ ∪ A′BC′ ∪ A′B′C
(g) two and no more occur: ABC′ ∪ AB′C ∪ A′BC
i.e. (AB ∪ AC ∪ BC) − ABC
(h) none occurs: A′B′C′
24. Show that

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(AB) − P(BC) − P(AC) + P(ABC)

Consider the 3 sets of points A, B and C. It is clear from the figure that the symbol AB′C′ refers to the set of points in A but not in B and C; CAB′ refers to the set of points in C and A but not in B, and so on (a prime denotes 'not'). In all there are 7 mutually exclusive sets.
Thus
P(A ∪ B ∪ C) = P(AB′C′) + P(BA′C′) + P(CA′B′) + P(ABC′) + P(BCA′) + P(ACB′) + P(ABC)
From the figure it is clear that
AB′C′ = A − AB − AC + ABC; ACB′ = AC − ABC etc.
P(A ∪ B ∪ C) = {P(A) − P(AB) − P(AC) + P(ABC)}
+ {P(B) − P(BC) − P(BA) + P(ABC)}
+ {P(C) − P(AC) − P(CB) + P(ABC)}
+ {P(AB) − P(ABC)} + {P(BC) − P(ABC)} + {P(AC) − P(ABC)} + {P(ABC)}
= P(A) + P(B) + P(C) − P(AB) − P(BC) − P(AC) + P(ABC)
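
The three-event formula can be illustrated on the events of Example 6, with A = queen, B = ten and C = spade in a deck of 52 cards. This is a sketch; the deck construction and the helper prob are illustrative.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]
DECK = set(product(RANKS, SUITS))


def prob(event):
    """Classical probability: favourable points / total points."""
    return Fraction(len(event), len(DECK))


A = {card for card in DECK if card[0] == "Q"}        # queen
B = {card for card in DECK if card[0] == "10"}       # ten
C = {card for card in DECK if card[1] == "spades"}   # spade

direct = prob(A | B | C)
by_formula = (prob(A) + prob(B) + prob(C)
              - prob(A & B) - prob(B & C) - prob(A & C)
              + prob(A & B & C))

print(direct, by_formula)
```

Both values agree with the 19/52 obtained by listing the cards in Example 6(b).
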


1.12. EXERCISES 


1. What do you understand by the term ‘probability’? 
2. Explain the terms (a) mutually exclusive events (b) independent events. (II PUC, Apr. 1975)
3. Define 'random experiment' and 'Sample Space'.
4. Define mutually exclusive events and mathematical probability. (II PUC, Apr. 1983)
5. Explain the following terms giving one example of each: (i) Equally likely events (ii) Exhaustive events (iii) Mutually exclusive events. (II PUC, Apr. 1984)
6. Define a 'set'. Define the 'union' and 'intersection' of two sets. (II PUC, Apr. 1984)
7. Write the Sample Space for the following random experiments:
(i) Two coins are tossed at the same time
(ii) A coin and a die are tossed
(iii) A box contains 3 red objects and another box contains two green objects. A child picks up one object from the first box and one object from the second box.

8. Let S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}, A = {1, 2, 3, 4, 5, 6} and B = {4, 5, 6, 7, 8, 9}. Write the elements belonging to (a) A ∪ B (b) (A ∪ B)′ (c) A ∩ B′ and A ∪ B′.

9. In how many ways can 3 men and 3 women be seated at a round table if (a) no restriction is imposed (b) two particular women must not sit together (c) each woman is to be between two men?

10. From 5 doctors and 6 statisticians, a committee consisting of 3 doctors and 2 statisticians is to be formed. How many different committees can be formed if (a) no restrictions are imposed (b) two particular doctors must be on the committee (c) one particular statistician cannot be on the committee?

11. State and prove the addition theorem of probability for any two events. (II PUC, Oct. 1978)

12. If A and B are any two events such that P(A) = …, P(B) = …, P(A ∪ B) = …, then find P(A ∩ B). (II PUC, Oct. 1978)

13. In a certain city it is found that the percentages of persons reading the newspapers 'A', 'B' and both 'A and B' are respectively 12, 25 and 8. Find the probability that a person selected at random from the city shall be a reader of paper A or paper B. (II PUC, Apr. 1978)

14. Write down the Sample Space for throwing two fair dice. Show that the probability of an event lies between 0 and 1. Also obtain the probability of the complementary event of A. (II PUC, Oct. 1977)

15. A box contains 50 screws and 40 nails. …/5th of the screws and 50% of the nails are rusted. An item is drawn at random from the box. What is the probability that the item is neither a rusted item nor a nail? (II PUC, Mar. 1977)

16. State and prove the multiplication theorem of probability for two independent events. (II PUC, Oct. 1976)

17. Two dice are tossed. What is the probability that the total is divisible by 3 or 4? (II PUC, Apr. 1984)


18. What do you understand by 'conditional probability'?

19. Find the probability of drawing two red balls one after the other from a bag containing 4 red and 5 black balls when (i) the ball first drawn is replaced (ii) the ball first drawn is not replaced. (II PUC, Apr. 1975)

20. A fair coin and a fair die are thrown. Find the probability of (i) head on the coin and the number 6 on the die (ii) head on the coin and even number on the die.

21. What is the probability that a ball drawn from an urn containing 3 red balls, 4 white balls and 5 blue balls will be white?

22. Suppose that among doctors there are 12 fields of specialisation and that there is an equal number of doctors in each field. Given a group of 6 doctors, what is the probability that no two among them will have the same field of specialisation?

23. A box contains 50 razor blades, 5 of which are known to be used, the remainder unused. What is the probability that 5 razor blades selected from the box will be unused?

24. Consider a family with two children. Assume that each child is as likely to be a boy as it is to be a girl. What is the conditional probability that both children are boys given that (i) the older child is a boy (ii) at least one of the children is a boy?

25. Let A, B, C be three arbitrary events. Find expressions for the events (i) only A occurs (ii) all 3 events occur (iii) at least two occur (iv) not more than 2 occur (v) none occurs.

26. In a certain family 4 girls take turns at washing dishes. Out of a total of 4 breakages, 3 were caused by the youngest girl, and she was thereafter called clumsy. Was she justified in attributing the frequency of her breakages to chance?

27. A manufacturing process produces 4% defective items. Experience shows that 25% of the defective items being produced are missed by the inspector. Good items always pass inspection satisfactorily. What is the probability that if you buy one of these items it will be a defective one?

28. A random experiment consists in drawing a card from an ordinary deck of 52 playing cards. Let the probability set function P assign a probability of 1/52 to each of the 52 possible outcomes. Let C1 denote the collection of 13 hearts, and let C2 denote the collection of 4 kings. Compute P(C1), P(C2), P(C1 ∩ C2) and P(C1 ∪ C2).

29. There are 4 choices in an objective type of question, only one of which results in success. What is the probability of 'success' in a given trial? What is the probability of failure in a given choice?

30. In a question paper there are 10 objective type questions, each with four possible choices of which only one is correct. What is the probability of answering five questions correctly?

31. What is the probability of getting a sum of 7 or 10 when two unbiased dice are thrown? (II PUC, Apr. 1984)

32. Three groups of children consist of one boy and 3 girls; 2 boys and 2 girls; 3 boys and one girl respectively. One child is selected at random from each group. What is the probability that the selected group consists of 2 boys and one girl? (II PUC, Apr. 1983)

33. Three cards are drawn from a pack of 52 cards. What is the probability of getting an ace, a queen and a king?

34. What is the probability of getting 2 heads and one tail in three tossings of a coin?

35. 5% of the screws produced by a machine are defective. What is the probability that out of 5 screws chosen at random there are (a) no defective screws (b) at least one defective screw?

36. What is the probability of obtaining a sum of 6 exactly twice when a pair of dice is thrown 3 times?

37. A firm has on its rolls 60 executives, 100 supervisors and 1000 workers. What is the probability that an employee chosen at random is an executive or a supervisor?

38. The probability that a man will be alive in the next 20 years is … and that his wife will be alive is …. What is the probability that in 20 years' time (a) both will be alive (b) at least one of them will be alive?

39. A rocket has an electrical unit that must function properly for ignition to take place. The unit is known to work 98% of the time. If a backup unit is available, what is the chance that at least one of them will work properly?

40. A property tax rate increase was proposed and it was found that 60% of the property owners opposed it, while 80% of the non-property owners favoured it. If 70% of the voters are property owners, what percent favour the tax increase?

41. Three lots of items contain 10%, 20% and 30% defective items respectively. One item is drawn at random from each lot. What is the probability that among the three items drawn there is (i) exactly one defective item (ii) at least one defective item?

42. The chances of winning of two race horses are 1/3 and … respectively. What is the probability that at least one will win when the horses are running (a) in different races and (b) in the same race?

43. A bag contains 10 white, 6 red, 4 black and 7 blue balls. Five balls are drawn at random. What is the probability that two of them are red and one black?

44. In a music class there are 6 boys and 5 girls. Two students are selected from this group. (a) What is the probability that the first student selected is a boy? (b) What is the probability of both being girls?

45. The probability that seeds of a particular type germinate is 0.85. The probability that a seed grows into a plant to produce a pink flower is 0.4. What is the probability of obtaining a pink flower from a seed selected at random?

46. 'A' can hit a target in 3 out of 5 shots, 'B' in 2 out of 3 shots and 'C' in 1 out of 2 shots. If all the 3 fire a shot simultaneously, find the probability that none hits the target.

47. A bag contains 8 white and 5 green balls. Two balls are drawn at random. Find the probability that (i) they are of the same colour (ii) they are of different colours.

48. Prove that P(AB) = P(B) P(A/B). Modify the result if events A and B are independent.

49. How many terms are there in (a + b)^n? Expand it.

50. It was found that 60% of the students of a college are smokers. If 4 students from this college are interviewed one after the other, what is the probability that (a) exactly one of them is a smoker (b) exactly two of them are smokers (c) first one of the two is a smoker?


51. If A and B are independent events with P(A) = … and P(B) = …, what is P(A + B)?

52. If P(A) = …, P(B) = … and P(A + B) = …, what is P(B/A)?

53. If A, B, C are three events, prove that

P(A + B + C) = P(A) + P(B) + P(C) − P(AB) − P(AC) − P(BC) + P(ABC)

54. If P(A) = 0.31, P(B) = 0.22 and P(C) = 0.13 and A, B, C are independent events, find the probability of occurrence of at least one of the three events A, B and C.


2. RANDOM VARIABLE 


2.1. RANDOM VARIABLE AND ITS DISTRIBUTION


Random Variables: The quantity 'y' is called a function of the real number 'x' if to every 'x' there corresponds a value 'y'. A function defined on a Sample Space is called a random variable. The word random or chance is used to designate variables whose value, in a random experiment, depends on the outcome of the experiment, which obviously depends on chance. Hence the random variable is sometimes referred to as a chance or stochastic variable too.

Consider the experiment of throwing a pair of coins. There are four equally likely outcomes, namely HH, HT, TH and TT, so that the Sample Space comprises 4 points. Let X be a random variable which can assume the values x1, x2, x3 and x4, where xi is the number of heads in Si.

Points in Sample Space S:   S1 (HH)   S2 (HT)   S3 (TH)   S4 (TT)
X(S):                       2         1         1         0

Hence a random variable is nothing but a numerical valued function defined on the Sample Space.
Let us consider the experiment of rolling two dice. The following 


is the list of all possible outcomes: 
11 21 31 41 51 61 
12 22 32 42 52 62 
13 23 33 43 53 63 
14 24 34 44 54 64 
15 25 35 45 55 65 
16 26 36 46 56 66 


The Sample Space has 36 points. The total shown by the two dice can be 2, 3, 4, 5, 6, 7, ..., 12, and these are the values of the random variable X, which can assume any of the values 2, 3, 4, ..., 12. Hence the value of the random variable X depends on the particular Sample point chosen, i.e. X is a function of the Sample points of the Sample Space.

Random variables are divided into two classes, namely discrete and continuous. A random variable is said to be discrete if it assumes only a finite or denumerable number of values, whereas it is a continuous random variable if it assumes a continuum of values.

Thus 'x' is said to be a discrete random variable if it is a random variable which assumes only a finite (or denumerable) number of values. A random variable 'x' is said to be continuous if it can take all possible values between certain limits.

Since the term random variable tends to mislead, the term random function is sometimes used.

Let X be a random variable and let the values which it assumes be x1, x2, x3, ... The aggregate of all Sample points on which X assumes a fixed value xj forms the event that X = xj, the probability of which is denoted by P(X = xj). The function P(X = xj) = f(xj) is then called the probability distribution of the random variable X. Thus f(xj) ≥ 0 and Σ f(xj) = 1.

In the example of rolling a pair of dice, the probabilities of getting 2, 3, 4, ..., 12 may interest us, wherein X denotes the total obtained in rolling two dice. For any one of these values which X can take, the probability can be determined. If f(x) denotes the probability that X = x, then f(x) is called the probability distribution of X. This can be displayed either in the form of a table or a graph.

Probability of the total obtained in throwing 2 dice

x:     2     3     4     5     6     7     8     9     10    11    12
f(x):  1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

Illustration 

Three marbles are drawn without replacement from an urn containing
4 red and 6 white marbles. If X is a random variable which denotes
the total number of red marbles drawn, (a) construct a table
showing the probability distribution of X, (b) represent the
distribution graphically and (c) find P(X = 2) and P(1 ≤ X ≤ 3).

$P(X = 0) = P\{\text{no red, i.e. all white}\} = \frac{{}^{6}C_{3}}{{}^{10}C_{3}} = \frac{20}{120} = \frac{1}{6}$

$P(X = 3) = P\{\text{all three red}\} = \frac{{}^{4}C_{3}}{{}^{10}C_{3}} = \frac{4}{120} = \frac{1}{30}$

Similarly $P(X = 1) = \frac{{}^{4}C_{1} \cdot {}^{6}C_{2}}{{}^{10}C_{3}} = \frac{1}{2}$ and $P(X = 2) = \frac{{}^{4}C_{2} \cdot {}^{6}C_{1}}{{}^{10}C_{3}} = \frac{3}{10}$

(a) Probability distribution table is

x:     0     1     2     3
P(x):  1/6   1/2   3/10  1/30

(b)

Fig. 10.

(c) P(X = 2) = probability of drawing a total of 2 red marbles = 3/10

P(1 ≤ X ≤ 3) = probability of drawing 1, 2 or 3 red marbles
= 1/2 + 3/10 + 1/30 = 5/6
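The marble probabilities are counts of combinations, so they can be checked mechanically. The sketch below assumes the urn of the illustration (4 red, 6 white, 3 drawn without replacement); the function name and its parameters are hypothetical.

```python
from fractions import Fraction
from math import comb


def red_marble_distribution(red=4, white=6, drawn=3):
    """Distribution of the number of red marbles drawn without replacement."""
    total_ways = comb(red + white, drawn)
    return {
        # Choose x red marbles and (drawn - x) white marbles.
        x: Fraction(comb(red, x) * comb(white, drawn - x), total_ways)
        for x in range(min(red, drawn) + 1)
    }


marbles = red_marble_distribution()
p_exactly_two = marbles[2]
p_one_to_three = marbles[1] + marbles[2] + marbles[3]
print(marbles, p_exactly_two, p_one_to_three)
```

The same function covers other urn compositions by changing the three arguments.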


2.2. CONTINUOUS RANDOM VARIABLE
If a variable X assumes a continuous set of values it is called a
continuous random variable. Here we are dealing with intervals rather
than individual points. Thus the random variable X will be called
a continuous random variable if there exists a function f(x) such
that $f(x) \geq 0$ for all x in the interval $-\infty < x < +\infty$, and such that
for any event A

$P(A) = P(X \text{ is in } A) = \int_{A} f(x)\,dx.$

f(x) is called the density of X, and X is said to be distributed as
f(x).

Consider a continuous curve y = p(x). The total area bounded
by the curve and the X-axis is equal to one, and the area under the
curve between X = a and X = b gives the probability that X lies
between a and b, i.e. P(a < X < b). p(x) is called the probability
density function.

Fig. 11. [graph of p(x)]


2.3. MATHEMATICAL EXPECTATION 
The average for a probability distribution is normally termed the
expectation.

For the purpose of illustration, consider a gamble wherein a person
gets Re 1 when he gets 1 in a throw of a die, Rs. 2 for 2 and so on.
If this is repeated a sufficiently large number of times, say 3600
times, then 1 appears approximately 600 times, 2 appears approximately
600 times and so on, so that the average the gambler is expected to
win per throw is

$1 \cdot \frac{600}{3600} + 2 \cdot \frac{600}{3600} + 3 \cdot \frac{600}{3600} + 4 \cdot \frac{600}{3600} + 5 \cdot \frac{600}{3600} + 6 \cdot \frac{600}{3600}$

$= 1 \cdot \frac{1}{6} + 2 \cdot \frac{1}{6} + \ldots + 6 \cdot \frac{1}{6} = \frac{21}{6} = \text{Rs. } 3.50$

This is the average or mean value of the outcome of the throw of
a die, say x, and is denoted by E(x).
If X denotes a discrete random variable which can assume the
values $x_1, x_2, x_3, \ldots, x_i$ with respective probabilities
$p_1, p_2, p_3, \ldots, p_i$, where $p_1 + p_2 + p_3 + \ldots + p_i = 1$,
then the mathematical expectation of X (or expected value of X) is
defined as

$E(X) = x_1 p_1 + x_2 p_2 + x_3 p_3 + \ldots + x_i p_i = \sum_{k=1}^{i} x_k p_k$

Consider the arithmetic mean for a frequency distribution and its
relationship with the expectation.
If the probabilities $p_i$ in the expectation are replaced by the
relative frequencies $f_i / N$, where $\sum f_i = N$, then the expectation
becomes

$E(X) = \frac{\sum f_i x_i}{N}$

which is nothing but the arithmetic mean of X for a sample of size
N wherein $x_1, x_2$ etc. appear with $f_1/N, f_2/N$ etc. as relative
frequencies.


Another Illustration
Consider a man who buys a ticket in a lottery that sells 100 tickets
and gives 4 prizes of Rs. 200, 10 prizes of Rs. 100 and 20 prizes of
Rs. 10. How much should the man be willing to pay for a ticket in this
lottery?

The expectation, say

$E(x) = 200 \cdot \frac{4}{100} + 100 \cdot \frac{10}{100} + 10 \cdot \frac{20}{100} = 8 + 10 + 2 = \text{Rs. } 20$

The average prize per ticket is Rs. 20 and hence this is the amount
a man should be willing to pay for a ticket.




Thus, the expected value of a random variable or any function of
a random variable can be obtained by finding the average value of
the function over all possible values of the variable.
Some fundamental theorems of Expectations

1. E(aX) = aE(X)
If X can take the values $x_1, x_2, \ldots, x_i$ with probabilities
$p_1, p_2, \ldots, p_i$, then as per the definition of expectation

$E(aX) = \sum_{k=1}^{i} a\,x_k p_k = a \sum_{k=1}^{i} x_k p_k = a\,E(X).$


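2. If $X_1, X_2, X_3, \ldots, X_n$ are random variables with expectations,
then the expectation of their sum is equal to the sum of their
expectations.

$E(X_1 + X_2 + \ldots + X_n) = E(X_1) + E(X_2) + \ldots + E(X_n)$

Consider only two variables X and Y, where $p_{ij}$ is the probability
that $X = x_i$ and $Y = y_j$.

$E(X + Y) = \sum_{i=1}^{n} \sum_{j=1}^{m} p_{ij} (x_i + y_j)$

$= \sum_{i=1}^{n} \sum_{j=1}^{m} p_{ij}\,x_i + \sum_{i=1}^{n} \sum_{j=1}^{m} p_{ij}\,y_j$

$= \sum_{i=1}^{n} x_i p_i + \sum_{j=1}^{m} y_j p_j$

$= E(X) + E(Y)$

Here $p_i = \sum_j p_{ij}$ is the probability that $X = x_i$ and
$p_j = \sum_i p_{ij}$ is the probability that $Y = y_j$.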


Another method
Consider two random variables X and Y defined on the same sample
space. These may assume the values $x_1, x_2, \ldots$ and $y_1, y_2, y_3, \ldots$
and their corresponding probability distributions are denoted by
$\{f(x_j)\}$ and $\{g(y_k)\}$. The aggregate of points in which the
conditions $X = x_j$ and $Y = y_k$ are satisfied forms an event whose
probability is denoted by $P(X = x_j, Y = y_k)$. This function
$P(X = x_j, Y = y_k)$ is called the joint probability distribution
of X and Y and is denoted by $p(x_j, y_k)$.

Thus $p(x_j, y_k) \geq 0$ and $\sum_{j,k} p(x_j, y_k) = 1$.

Consider
$E(X) + E(Y) = \sum x_j\,p(x_j, y_k) + \sum y_k\,p(x_j, y_k)$
the summation extending over all possible values of $x_j, y_k$, so that
their sum after rearrangement is

$\sum (x_j + y_k)\,p(x_j, y_k) = E(X + Y)$


3. If X and Y are mutually independent random variables with
finite expectations, then their product is a random variable with
finite expectation and E(XY) = E(X) E(Y), i.e. the mathematical
expectation of the product of 2 independent random variables is
equal to the product of their expectations.

Let the random variable X assume the values $x_j$ (j = 1, 2, ..., n) with
respective probabilities $p_j$ and let Y, another random variable, assume
the values $y_k$ (k = 1, 2, ..., m) with respective probabilities $p'_k$.

Then as per the definition of expectation

$E(X) = \sum_{j=1}^{n} p_j x_j$ and $E(Y) = \sum_{k=1}^{m} p'_k y_k$

Let $p_{jk} = P(X = x_j, Y = y_k) = P(X = x_j)\,P(Y = y_k)$, since X and
Y are independent, so that

$p_{jk} = p_j\,p'_k$

The product XY is also a random variable which can assume the m × n
values $x_j y_k$ (j = 1, 2, ..., n; k = 1, 2, ..., m).

Then $E(XY) = \sum_{j=1}^{n} \sum_{k=1}^{m} x_j y_k\,p_j\,p'_k$

$= \sum_{j=1}^{n} x_j p_j \sum_{k=1}^{m} y_k p'_k = E(X)\,E(Y)$

In general, E(X·Y·Z...) = E(X)·E(Y)·E(Z)... for more than two
independent random variables.
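The addition and multiplication theorems can be checked on a small joint distribution. The following is a minimal sketch, assuming X and Y are the faces of two independent fair dice (an example chosen here, not taken from the proof above); all variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)                      # probability of each face

# Marginal expectation of one die.
e_x = sum(x * p for x in faces)
e_y = e_x                               # the second die has the same law

# Joint distribution under independence: p(x, y) = p(x) * p(y).
joint = {(x, y): p * p for x, y in product(faces, faces)}

e_sum = sum((x + y) * pr for (x, y), pr in joint.items())
e_prod = sum(x * y * pr for (x, y), pr in joint.items())

print(e_x, e_sum, e_prod)
```

The addition theorem holds for any joint distribution, whereas the product rule relies on the factorisation p(x, y) = p(x) p(y) used to build `joint`.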

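5. E(a) = a, that is, the expected value of a constant is the constant itself.
$E(a) = \sum a\,p_i = a \sum p_i = a \cdot 1 = a$
so that
$E(aX + b) = a\,E(X) + b$
where X is a random variable and a and b are constants:

$E(aX + b) = \sum_{i=1}^{n} (a x_i + b)\,p_i = \sum_{i=1}^{n} a x_i p_i + \sum_{i=1}^{n} b\,p_i$

$= a \sum_{i=1}^{n} x_i p_i + b \sum_{i=1}^{n} p_i = a\,E(X) + b$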


Variance: The variance of a probability distribution can also be
represented by an expected value, as in the case of the mean.

$E[(X - \mu)^2] = \sum (x - \mu)^2\,p(x) = E(X^2) - [E(X)]^2$

Covariance: If X and Y are two random variables then the covariance
between them is
$\mathrm{Cov}(X, Y) = E[\{X - E(X)\}\{Y - E(Y)\}]$
$= E(XY) - E(X)\,E(Y)$
If X and Y are independent then E(XY) = E(X) E(Y) so that
Cov (X, Y) = 0

2.5. EXAMPLES: 
1. Two unbiased dice are thrown. Find out the expected value
of the sum of the points on the two dice. (II PUC-March 1974)

$E(X) = \sum p\,x = 2 \cdot \frac{1}{36} + 3 \cdot \frac{2}{36} + \ldots + 12 \cdot \frac{1}{36} = \frac{252}{36} = 7$

since the probability function of X, the total of the numbers on the
1st and 2nd dice, is

Total points:  2     3     4     5     6     7     8     9     10    11    12
Probability:   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36


2. A charity affair sells raffle tickets for Rs. 10 each. There are
10 prizes worth Rs. 20 each and one bumper prize worth Rs. 1000.
If 300 tickets are sold out and a person buys one of them, what is
his expectation?

Here the possible values of x, the prize one can get, are nil, Rs. 20
and Rs. 1000 with respective probabilities 289/300, 10/300 and 1/300.

Hence $E(x) = 0 \cdot \frac{289}{300} + 20 \cdot \frac{10}{300} + 1000 \cdot \frac{1}{300} = \frac{1200}{300} = \text{Rs. } 4$


3. A bag contains 6 tickets numbered 1 to 6. A person draws
two tickets at random. If the sum of the numbers on the tickets drawn
is even, he gets Rs. 10, otherwise he loses Rs. 5. Show that the
expectation of his gain is Re 1 only. (II PUC-Apr. 1978)

Two tickets can be selected out of 6 in ${}^{6}C_{2}$ ways, i.e. 15 ways.

The favourable cases for the event that the sum is even are
(1, 3) (1, 5) (2, 4) (2, 6) (3, 5) (4, 6), i.e. 6 ways.

Hence the probability of getting an even sum = 6/15 = 2/5

so that the probability of getting an odd sum is 3/5.

The expectation of his gain
E(x) = (Rs. 10) (Prob. even) + (Rs. - 5) (Prob. odd)

$= 10 \cdot \frac{2}{5} - 5 \cdot \frac{3}{5} = 4 - 3 = \text{Re. } 1$


4. If it rains, an umbrella salesman can earn Rs. 300 a day. If
it is fair he can lose Rs. 60 per day. What is his expectation if the
probability of rain is 0.3?

probability of rain = 0.3, hence
probability of fair weather = 0.7, i.e. (1 - 0.3)
Expectation = (Rs. 300) (0.3) + (- Rs. 60) (0.7)
= 90 - 42.00 = Rs. 48.00
5. Find E(X), E(X²) and E[(X - X̄)²] for the following probability
distribution:

X:     8     12    16    20    24
P(X):  1/8   1/6   3/8   1/4   1/12

$E(X) = \sum X\,p(X) = 8 \cdot \frac{1}{8} + 12 \cdot \frac{1}{6} + 16 \cdot \frac{3}{8} + 20 \cdot \frac{1}{4} + 24 \cdot \frac{1}{12}$

$= 1 + 2 + 6 + 5 + 2 = 16$

$E(X^2) = \sum X^2\,p(X) = 8^2 \cdot \frac{1}{8} + 12^2 \cdot \frac{1}{6} + 16^2 \cdot \frac{3}{8} + 20^2 \cdot \frac{1}{4} + 24^2 \cdot \frac{1}{12}$

$= 8 + 24 + 96 + 100 + 48 = 276$

$E[(X - \bar{X})^2] = \sum (X - \bar{X})^2\,p(X) = E(X^2) - [E(X)]^2$
$= 276 - 16^2 = 276 - 256 = 20$
which is the variance of the distribution.
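The three quantities of Example 5 follow one pattern: weight each value (or its square) by its probability and add. The sketch below assumes the probabilities 1/8, 1/6, 3/8, 1/4 and 1/12 read off from the worked solution; the helper name `moments` is hypothetical.

```python
from fractions import Fraction


def moments(values, probs):
    """Return (E(X), E(X^2), variance) of a discrete distribution."""
    assert sum(probs) == 1, "probabilities must add up to one"
    mean = sum(x * p for x, p in zip(values, probs))
    second = sum(x * x * p for x, p in zip(values, probs))
    # Variance via the shortcut E(X^2) - [E(X)]^2.
    return mean, second, second - mean ** 2


values = [8, 12, 16, 20, 24]
probs = [Fraction(1, 8), Fraction(1, 6), Fraction(3, 8),
         Fraction(1, 4), Fraction(1, 12)]
mean, second, variance = moments(values, probs)
print(mean, second, variance)
```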

6. What is the standard deviation of X if E(X²) = 225 and
E(X) = 9? (II PUC-April 1984)
Variance of X = E(X²) - [E(X)]² = 225 - (9)² = 225 - 81 = 144

S.D. of X = √(variance of X) = √144 = 12

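7. Given the following probability distribution

X:     -1    0     1     2
P(X):  1/5   1/10  3/10  2/5

Find the value of E(X) and Var (X) (II PUC-Apr. 1984)

$E(X) = \sum x_i p_i = (-1) \cdot \frac{1}{5} + 0 \cdot \frac{1}{10} + 1 \cdot \frac{3}{10} + 2 \cdot \frac{2}{5} = \frac{9}{10}$

$E(X^2) = \sum x_i^2 p_i = (-1)^2 \cdot \frac{1}{5} + 0^2 \cdot \frac{1}{10} + 1^2 \cdot \frac{3}{10} + 2^2 \cdot \frac{2}{5} = \frac{21}{10}$

$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{21}{10} - \left(\frac{9}{10}\right)^2 = \frac{129}{100} = 1.29$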


8. Mr. John tells his son Edward "Throw a die. You will have
pocket money equivalent to two times the face value that
occurs and, in addition to this, rupees 5." Find the expectation of
the pocket money that Edward would get. (II PUC-March 1977)
The solution for this problem can be obtained from E(ax + b)
= a E(x) + b, where ‘a’ is the constant multiplying the face value,
so that a = 2 and b = 5.

$E(x) = \sum p\,x = \frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = \frac{7}{2}$

Expectation of Edward's pocket money $= 2 \cdot \frac{7}{2} + 5 = \text{Rs. } 12$


9. In a lottery 2 tickets are drawn out of 10 tickets numbered
from 1 to 10. Find the mathematical expectation of the sum of the
numbers on the tickets drawn. (II PUC-Oct. 74)

Let $S = X_1 + X_2$ where $X_1$ and $X_2$ are the numbers on the two
tickets drawn.

$E(X_1) = \frac{1}{10}(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10) = 5.5$

since the number that is drawn can be any one of the 10 values
1, 2, 3, ..., 10, each with probability 1/10.

Similarly $E(X_2) = 5.5$
Hence $E(X_1 + X_2) = E(X_1) + E(X_2) = 5.5 + 5.5 = 11$


10. A random variable assumes the value 1 with probability p,
and zero with probability q = 1 - p. Prove that

(a) E(X) = p and (b) E[(X - X̄)²] = pq

(a) $E(X) = \sum x_i p_i = 1 \cdot p + 0 \cdot q = p$
(b) $E[(X - \bar{X})^2] = E(X^2) - [E(X)]^2$
But $E(X^2) = \sum x_i^2 p_i = 1^2 \cdot p + 0^2 \cdot q = p$
Hence $E[(X - \bar{X})^2] = p - p^2 = p(1 - p) = pq$
since q = 1 - p.


11. Two coins are tossed. ‘A’ will receive Rs. 50 if the outcome
is 2 heads and Rs. 10 if the outcome results in one head, and it costs
‘A’ Rs. 60 if the outcome is zero heads. Should he undertake such
a venture?
The probability of getting two heads = 1/4, which is the probability
of winning Rs. 50.
One head will occur with probability 1/2 and for zero heads the
probability is 1/4.
Hence Expectation $= \frac{1}{4}(50) + \frac{1}{2}(10) - \frac{1}{4}(60) = \frac{10}{4} = \text{Rs. } 2.50$

Since the expectation is Rs. 2.50, it is worthwhile to play the game
unless sentiment comes in the way.
12. Let X have the probability density function
f(x) = x/6, x = 1, 2, 3; = 0 elsewhere.
Find the variance of X and E(X³).
Variance of X = E(X²) - [E(X)]²

$= \sum_{x=1}^{3} x^2 f(x) - \left[\sum_{x=1}^{3} x f(x)\right]^2$

$= \frac{1 + 8 + 27}{6} - \left[\frac{1 + 4 + 9}{6}\right]^2 = 6 - \left(\frac{7}{3}\right)^2 = 6 - \frac{49}{9} = \frac{5}{9}$

$E(X^3) = \sum_{x=1}^{3} x^3 \cdot \frac{x}{6} = \frac{1 + 16 + 81}{6} = \frac{98}{6} = \frac{49}{3}$

13. Suppose the probability is 0.6 that Vijay Amritraj on a tour
will win any single match against his touring opponent. The tour
consists of 5 matches. Let x denote the number of matches this
player will win before his first defeat occurs. Find the distribution
of x and calculate the mean and variance of x.

x (matches)    P(x)
0              0.4 = 0.4
1              (0.6) (0.4) = 0.24
2              (0.6)² (0.4) = 0.144
3              (0.6)³ (0.4) = 0.0864
4              (0.6)⁴ (0.4) = 0.0518
5              (0.6)⁵ = 0.0778

Mean = E(x) = Σ x_i p_i

Mean = 0(0.4) + 1(.24) + 2(.144) + 3(.0864) + 4(.0518) + 5(.0778)

= 1.3834

Variance of x = E(X²) - [E(X)]²

= 0(.4) + 1(.24) + 4(.144) + 9(.0864) + 16(.0518)
+ 25(.0778) - (1.3834)²

= 0.24 + 0.576 + 0.7776 + 0.8288 + 1.945 - 1.9138
= 2.4536
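The distribution in Example 13 is a geometric law cut off at the length of the tour, which makes it easy to tabulate and to check by machine. This is a minimal sketch under the assumptions of the example (win probability 0.6, five matches, independent results); the function name is hypothetical.

```python
def wins_before_first_defeat(p_win=0.6, matches=5):
    """P(x) for x = number of wins before the first defeat on the tour."""
    q = 1 - p_win
    # x wins followed by a defeat, for x = 0 .. matches - 1.
    dist = {x: p_win ** x * q for x in range(matches)}
    # All matches won: no defeat occurs at all.
    dist[matches] = p_win ** matches
    return dist


tour = wins_before_first_defeat()
tour_mean = sum(x * p for x, p in tour.items())
tour_var = sum(x * x * p for x, p in tour.items()) - tour_mean ** 2
print(tour)
print(round(tour_mean, 4), round(tour_var, 4))
```

Because the code works with unrounded probabilities, its mean and variance can differ in the third or fourth decimal from hand calculations based on four-figure table entries.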


EXERCISES 


1. Define a random variable and its distribution.

2. Explain discrete random variable with an example.

3. What do you understand by ‘continuous random variable’?

4. Explain the terms probability density function and density
function.

5. Define a random variable and its mathematical expectation.

6. A and B play a game of tossing a coin; he who first throws
a head wins the game and the game terminates. If A begins the
game and each player wins an amount of money equal to the
number of tosses required for the win, find their respective
mathematical expectations.

7. State and prove the addition theorem of expectations.

8. If ‘a’ is a constant, find E(a).

9. If X and Y are random variables prove that
E(X + Y) = E(X) + E(Y)

10. Prove the following:
(i) E(X - Y) = E(X) - E(Y)
(ii) E(c) = c, where c is a constant. (II PUC-Oct. 1977)

11. If X and Y are independent random variables show that
E(XY) = E(X)·E(Y)

12. ‘A’ enters into a competition of hitting a target. If he hits the
target, he gets Rs. 10, otherwise he has to pay Rs. 5. If the
probability of hitting the target is 2/[denominator illegible], what
is his mathematical expectation? (II PUC-Oct. 1976)


13. If ‘p’ buys a ticket in a lottery wherein there is one prize of
Rs. 1000 and 10 prizes of Rs. 100, and if 10,000 tickets are
sold, what is his mathematical expectation?

14. There are three identical-appearing envelopes in a drawer. One
contains two Rs. 10 notes and one Rs. 100 note, the second
contains one Rs. 10 note and two Rs. 100 notes and the third
contains three Rs. 10 notes. If a person is allowed to pick one
envelope and draw one note from the envelope without looking
at it, what is his expectation?

15. Find E(X), E(X - X̄)² and E(2X + 5) for the following
probability distribution.

X:     4     8     12    16    20
p(X):  1/5   1/8   1/6   3/8   2/15

16. If X is the random variable showing the number of boys in a
family with 4 children, the probability distribution of which
is given below, obtain E(X), variance of X and E(X²).

X:     0     1     2     3     4
P(X):  1/16  4/16  6/16  4/16  1/16

17. Prove that E(2X + 7) = 2E(X) + 7 and
E[(X - X̄)²] = E(X²) - [E(X)]²

18. For the following probability distribution obtain the mean and
standard deviation.


x s 0 1 2 3 4 
ї "b 1 3 1 1 
Р(Х) 8: 16 4 8 4 16 


19. Find the expectation and variance of the number of successes
in a series of ‘n’ independent trials, the probability of
success in the ith trial being $p_i$.

20. In a lottery ‘n’ tickets are drawn at a time out of tickets
numbered 1 to N. Find the expectation and the variance of the sum
‘S’ of the numbers on the tickets drawn.

21. Show that Cov (aX + bY, cX + dY) = ac V(X) + bd V(Y)
+ (ad + bc) Cov (X, Y)

22. Express Karl Pearson's correlation coefficient formula in terms
of expectations.

23. A person draws 2 balls from a bag containing 4 white and 5
red balls. If he is to receive Re 1 for every white ball which he
draws and Rs. 2 for each red ball, find his expectation.

24. Given the probability function

X:     0     1     2     3
p(x):  0.1   0.3   0.5   0.1

Let Y = X² + 2X, then find the probability function of Y and
the mean and variance of Y.

25. If X and Y are independent variables with means 10 and 20
and variances 2 and 3 respectively, find the variance of
4X + 5Y.


3. Probability Distributions 


3.1 BINOMIAL DISTRIBUTION

The Binomial distribution is a distribution associated with repeated
trials of an experiment. Repeated independent trials are known as
Bernoulli trials when there are only two possible outcomes for each
trial and their probabilities remain the same throughout the trials.
Normally it is the number of successes produced in a succession of
‘n’ Bernoulli trials that is needed, and not the order in which they
occur.

Consider any random experiment and an event A for that experiment,
wherein the probability of occurrence of the event is P(A) = p and
the probability of non-occurrence of the event is P(Ā) = 1 - p = q.
Suppose that this experiment is repeated ‘n’ times independently and
let x be the number of times the event A occurs, so that the sample
space is S = {0, 1, 2, ..., (n - 1), n}. Then the probability
distribution of x is known as the Binomial distribution.

The number of successes in a sequence of n repetitions of the
experiment can be 0, 1, 2, 3, ..., n, and the next step is
to determine the corresponding probabilities. The probability that
the first ‘x’ repetitions produce A and the remaining (n - x) produce
Ā is

$p \cdot p \cdots p \cdot q \cdot q \cdots q = p^x q^{n-x}$

The event of n trials resulting in x successes and (n - x) failures can
happen in ${}^{n}C_{x}$ ways, so that the probability that there are exactly
x successes is ${}^{n}C_{x}\,p^x q^{n-x}$, and the probability function
$f(x) = {}^{n}C_{x}\,p^x q^{n-x}$; x = 0, 1, 2, ..., n is called the Binomial distribution.
Obviously the probability of no success is $q^n$ and the probability of
at least one success is $(1 - q^n)$.

Suppose a coin is tossed ‘n’ times and it is desired to know the
probability of obtaining exactly ‘x’ heads. The set of resulting tosses
can be any sequence of heads and tails.

The probability of obtaining a head in a single toss is 1/2, the
number of favourable arrangements is ${}^{n}C_{x}$ and the total number of
ways in which the tosses may turn out is $2^n$, as for each of the n
independent tosses there exist two possibilities. Hence the probability
of getting exactly x heads is

$\frac{{}^{n}C_{x}}{2^n} = \frac{n!}{x!\,(n-x)!} \left(\frac{1}{2}\right)^n$

The Binomial distribution is one of the most frequently used discrete
distributions in practice. It is known as a discrete distribution
since x is an integer between 0 and n inclusive.
For x = 0, 1, 2, ..., n, let us consider the probability distribution

x      f(x)
0      ${}^{n}C_{0}\,p^0 q^{n-0} = q^n$
1      ${}^{n}C_{1}\,p^1 q^{n-1} = n p q^{n-1}$
2      ${}^{n}C_{2}\,p^2 q^{n-2}$
3      ${}^{n}C_{3}\,p^3 q^{n-3}$
...
n      ${}^{n}C_{n}\,p^n q^{0} = p^n$

$\sum f(x) = q^n + {}^{n}C_{1}\,q^{n-1} p + {}^{n}C_{2}\,q^{n-2} p^2 + {}^{n}C_{3}\,q^{n-3} p^3 + \ldots + p^n$

$= (q + p)^n = 1$

since one of the events A or Ā must happen, P(A) + P(Ā) = 1 with
P(A) = p and P(Ā) = q, and the terms on the right hand side are the
expansion of $(q + p)^n$ by the Binomial theorem. In other words, the
successive values of the probability function are the successive terms
in the Binomial expansion of $(q + p)^n$, hence the reason
for calling this the Binomial distribution.
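The claim that the Binomial probabilities are the terms of the expansion of (q + p)^n, and therefore add up to one, can be checked numerically. The sketch below uses n = 10 and p = 0.3, values chosen here purely for illustration; `binomial_pmf` is a hypothetical helper name.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n Bernoulli trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


n, p = 10, 0.3                      # illustrative values only
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                    # should equal (q + p)^n = 1
bin_mean = sum(x * f for x, f in enumerate(pmf))    # compare with n*p
bin_var = sum(x * x * f for x, f in enumerate(pmf)) - bin_mean ** 2  # n*p*q

print(total, bin_mean, bin_var)
```

Up to floating-point rounding, the printed mean and variance agree with np and npq, the values derived algebraically for this distribution.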


3.2. MEAN AND VARIANCE OF THE BINOMIAL 
DISTRIBUTION 


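Mean $= E(X) = \sum_{x=0}^{n} x\,f(x)$

$= \sum_{x=0}^{n} x \cdot {}^{n}C_{x}\,p^x q^{n-x}$

$= \sum_{x=1}^{n} x \cdot \frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}$

$= \sum_{x=1}^{n} \frac{n\,(n-1)!}{(x-1)!\,(n-x)!}\,p \cdot p^{x-1} q^{n-x}$

$= np \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)!\,(n-x)!}\,p^{x-1} q^{n-x}$

$= np\,(q + p)^{n-1}$, and since (q + p) = 1,
$= np$

Variance
$= E(X^2) - [E(X)]^2$

$E(X^2) = \sum_{x=0}^{n} x^2 \cdot \frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}$

Since $x^2 = x(x - 1) + x$,

$E(X^2) = \sum_{x=2}^{n} x(x-1) \cdot \frac{n(n-1)(n-2)!}{x!\,(n-x)!}\,p^x q^{n-x} + \sum_{x=1}^{n} x \cdot \frac{n\,(n-1)!}{x!\,(n-x)!}\,p^x q^{n-x}$

$= n(n-1)p^2 \sum_{x=2}^{n} \frac{(n-2)!}{(x-2)!\,(n-x)!}\,p^{x-2} q^{n-x} + np \sum_{x=1}^{n} \frac{(n-1)!}{(x-1)!\,(n-x)!}\,p^{x-1} q^{n-x}$

$= n(n-1)p^2 (q + p)^{n-2} + np\,(q + p)^{n-1}$
$= n(n-1)p^2 + np$

Hence, the variance of the Binomial distribution
$\sigma^2 = E(X^2) - [E(X)]^2$
$= n(n-1)p^2 + np - n^2p^2$
$= n^2p^2 - np^2 + np - n^2p^2$
$= np\,(1 - p) = npq$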


Similarly the third and the fourth moments about the mean can
also be obtained.
Thus $\mu_3 = npq(q - p)$ and $\mu_4 = npq[1 + 3(n - 2)pq]$

and hence $\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{n^2p^2q^2(q - p)^2}{n^3p^3q^3} = \frac{(q - p)^2}{npq}$

$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{npq[1 + 3(n - 2)pq]}{n^2p^2q^2} = 3 + \frac{(1 - 6pq)}{npq}$

3.3 ILLUSTRATIONS 

(1) A manufacturer of certain parts for automobiles guarantees
that a box of his parts will contain at most two defective items.
If the box holds 20 parts and experience has shown that his
manufacturing process produces 2% defective items, what is the
probability that a box of his parts will satisfy the guarantee?

This may be considered to follow the Binomial distribution wherein

n = 20 and p = 2/100 = 0.02.

Here the probability of at most 2 defective parts, namely p(x ≤ 2),
is to be determined.

$p(x) = \frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}$

$p(0) = \frac{20!}{0!\,20!}\,(0.02)^0 (0.98)^{20} = 0.668$

$p(1) = \frac{20!}{1!\,19!}\,(0.02)^1 (0.98)^{19} = 0.273$

$p(2) = \frac{20!}{2!\,18!}\,(0.02)^2 (0.98)^{18} = 0.053$

Hence p(x ≤ 2) = p(0) + p(1) + p(2)
= 0.994

which shows that the manufacturer’s guarantee is almost always satisfied.
2. Six coins are thrown 100 times. What is the probability of
getting (i) 2 heads (ii) at least 3 heads? (II PUC-March 1977)

$p(x) = {}^{n}C_{x}\,p^x q^{n-x}$. Here, n = 6 and p = q = 1/2.

(i) Probability of getting 2 heads with 6 coins is

$p(2) = {}^{6}C_{2} \left(\frac{1}{2}\right)^2 \left(\frac{1}{2}\right)^4 = \frac{6 \cdot 5}{2} \cdot \frac{1}{64} = \frac{15}{64}$

(ii) p (at least 3 heads) = 1 - [p(0) + p(1) + p(2)]

$= 1 - \frac{1 + 6 + 15}{64} = 1 - \frac{22}{64} = \frac{21}{32}$

3. The probability that a student entering a college with
Statistics and Mathematics as optional subjects will complete the
course is 0.37. Determine the probability that out of 6 students
(a) none, (b) one and (c) at least one will complete the course.

(a) p (none will complete the course) $= {}^{6}C_{0}\,(0.37)^0 (0.63)^6$
= 0.0625

(b) p (one will complete the course) $= {}^{6}C_{1}\,(0.37) (0.63)^5$
= (6) (0.37) (0.0992)
= 0.2203

(c) p (at least one will complete the course) = 1 - p (none will
complete)
= 1 - 0.0625
= 0.9375
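4. An unbiased coin is tossed 15 times. Find the probability
that there will be at least 12 heads. (II PUC-Oct. 74)

As the coin is unbiased p = q = 1/2. Probability of getting at least
12 heads out of 15 throws

$= {}^{15}C_{12} \left(\frac{1}{2}\right)^{12} \left(\frac{1}{2}\right)^{3} + {}^{15}C_{13} \left(\frac{1}{2}\right)^{13} \left(\frac{1}{2}\right)^{2} + {}^{15}C_{14} \left(\frac{1}{2}\right)^{14} \left(\frac{1}{2}\right) + {}^{15}C_{15} \left(\frac{1}{2}\right)^{15}$

$= \left(\frac{1}{2}\right)^{15} \left[{}^{15}C_{12} + {}^{15}C_{13} + {}^{15}C_{14} + {}^{15}C_{15}\right]$

$= \left(\frac{1}{2}\right)^{15} (455 + 105 + 15 + 1) = \frac{576}{32768} = \frac{9}{512}$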

Nie 
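5. Find the probability that in a family of 4 children there will
be (a) at least 1 boy (b) at least 1 boy and 1 girl, assuming that the
probability of a male birth is 1/2.

(a) p (at least 1 boy) = p (1 boy) + p (2 boys) + p (3 boys)
+ p (4 boys)

$= {}^{4}C_{1} \left(\frac{1}{2}\right)\left(\frac{1}{2}\right)^3 + {}^{4}C_{2} \left(\frac{1}{2}\right)^2\left(\frac{1}{2}\right)^2 + {}^{4}C_{3} \left(\frac{1}{2}\right)^3\left(\frac{1}{2}\right) + {}^{4}C_{4} \left(\frac{1}{2}\right)^4$

$= \frac{4}{16} + \frac{6}{16} + \frac{4}{16} + \frac{1}{16} = \frac{15}{16}$

(b) p (at least 1 boy and 1 girl) = 1 - p (no boy) - p (no girl)
$= 1 - \frac{1}{16} - \frac{1}{16} = \frac{7}{8}$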


3.4 FITTING BINOMIAL DISTRIBUTION TO 
OBSERVED DATA 

Theoretical distributions usually depend on unknown parameters,
and if these parameters are specified completely for a theoretical
distribution, the expected or theoretical frequencies can be
computed.

In order to calculate the successive values of f(x) or p(x) in the
Binomial distribution, the following method is used.

$f(x) = \frac{n!}{x!\,(n-x)!}\,p^x q^{n-x}$ ... (1)

hence $f(x+1) = \frac{n!}{(x+1)!\,(n-x-1)!}\,p^{x+1} q^{n-x-1}$

$\frac{f(x+1)}{f(x)} = \frac{n!}{(x+1)!\,(n-x-1)!} \cdot \frac{x!\,(n-x)!}{n!} \cdot \frac{p^{x+1} q^{n-x-1}}{p^x q^{n-x}} = \frac{n-x}{x+1} \cdot \frac{p}{q}$

Thus $f(x+1) = f(x) \cdot \frac{n-x}{x+1} \cdot \frac{p}{q}$ ... (2)

Using (1), $f(0) = q^n$

By using relation (2), $f(1) = f(0) \cdot \frac{n-0}{0+1} \cdot \frac{p}{q} = q^n \cdot n \cdot \frac{p}{q} = n p q^{n-1}$

Similarly f(2), f(3), ... can also be obtained. The relation
$f(x+1) = f(x) \cdot \frac{n-x}{x+1} \cdot \frac{p}{q}$ is known as the recursion formula.


Example: A biased coin was tossed 4 times and the whole experiment
was repeated 150 times. The following frequencies of 0, 1, 2,
3, 4 heads were obtained.

number of heads:  0    1    2    3    4
frequency:        10   37   58   34   11

Fit a Binomial distribution to the data.

x        f      fx
0        10     0
1        37     37
2        58     116
3        34     102
4        11     44
Total    150    299

Mean = Σfx / Σf = 299/150 = 1.9933

The mean of the Binomial distribution is np,
i.e. np = 1.9933
or 4p = 1.9933
i.e. p = 0.4983
and q = 1 - p = 0.5017

$p(x) = {}^{n}C_{x}\,p^x q^{n-x}$
For n = 4, p = 0.4983 and q = 0.5017 the individual probabilities
are obtained by substitution.

No. of heads   Probability $p(x) = {}^{4}C_{x}\,p^x q^{4-x}$               Expected frequency
0              $p(0) = {}^{4}C_{0}\,(.4983)^0 (.5017)^4 = 0.0634$          9.51
1              $p(1) = {}^{4}C_{1}\,(.4983)^1 (.5017)^3 = 0.2517$          37.755
2              $p(2) = {}^{4}C_{2}\,(.4983)^2 (.5017)^2 = 0.3750$          56.25
3              $p(3) = {}^{4}C_{3}\,(.4983)^3 (.5017)^1 = 0.2483$          37.245
4              $p(4) = {}^{4}C_{4}\,(.4983)^4 = 0.0616$                    9.24
Total          1.0000                                                      150

Expected frequency = N p(x)

No. of heads:        0,  1,  2,  3,  4
Observed frequency:  10, 37, 58, 34, 11
Expected frequency:  10, 38, 56, 37, 9
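The fitting procedure (estimate p from the sample mean, then multiply each Binomial probability by the number of repetitions) can be sketched as below. It assumes the observed frequencies 10, 37, 58, 34, 11 of the coin example; the function name `fit_binomial` is hypothetical.

```python
from math import comb


def fit_binomial(freqs):
    """Fit a Binomial law to frequencies of 0, 1, ..., n successes.

    Returns the estimate of p and the list of expected frequencies.
    """
    n = len(freqs) - 1                  # number of trials per experiment
    total = sum(freqs)                  # number of repetitions N
    mean = sum(x * f for x, f in enumerate(freqs)) / total
    p = mean / n                        # equate sample mean to n*p
    q = 1 - p
    probs = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]
    return p, [total * pr for pr in probs]


p_hat, expected = fit_binomial([10, 37, 58, 34, 11])
print(round(p_hat, 4))
print([round(e, 2) for e in expected])
```

The unrounded expected frequencies sum to N exactly; after rounding to whole numbers the total should still be checked against N.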


(2) Fit a Binomial distribution to the following data

X:  0    1    2    3    4
f:  30   62   46   10   2

Mean = Σfx / Σf = (62 + 92 + 30 + 8) / 150 = 192/150 = 1.28

p = 1.28/4 = 0.32 and hence q = 0.68

$p(x) = {}^{4}C_{x}\,(0.32)^x (0.68)^{4-x}$
$p(0) = {}^{4}C_{0}\,(.32)^0 (.68)^{4} = 0.2138$
Using the recursion formula, p(1), p(2), ... can be obtained.

$p(x+1) = p(x) \cdot \frac{n-x}{x+1} \cdot \frac{p}{q}$

$p(1) = p(0) \cdot \frac{4}{1} \cdot \frac{0.32}{0.68} = (0.2138)(1.8824) = 0.4025$

$p(2) = p(1) \cdot \frac{4-1}{1+1} \cdot \frac{0.32}{0.68} = (0.4025)\left(\frac{3}{2}\right)(0.4706) = 0.2841$

$p(3) = p(2) \cdot \frac{4-2}{2+1} \cdot \frac{0.32}{0.68} = (0.2841)\left(\frac{2}{3}\right)(0.4706) = 0.0891$

$p(4) = p(3) \cdot \frac{4-3}{3+1} \cdot \frac{0.32}{0.68} = (0.0891)\left(\frac{1}{4}\right)(0.4706) = 0.0105$

x:                   0       1       2       3       4       Total
Probability:         0.2138  0.4025  0.2841  0.0891  0.0105  1.0000
Expected frequency:  32      60      43      13      2       150


3.5 POISSON DISTRIBUTION

As has been seen earlier, the Binomial distribution is applicable in
situations wherein a sample of definite size is taken and we observe
the number of times an event actually occurred as well as the number
of times it did not occur. Sometimes it may be possible to count
the number of times an event occurred, whereas counting the number of
times it did not occur may be neither possible nor feasible. Further,
if ‘n’ is large the Binomial probabilities involve laborious
computations. In such cases the Poisson distribution is used. If ‘n’
is large and p tends to zero, the Binomial distribution tends to what
is known as the Poisson distribution.

Definition: The random variable X is distributed as Poisson if the
density is $f(x) = \frac{e^{-\lambda} \lambda^x}{x!}$; x = 0, 1, 2, 3, ..., where λ is any positive
real number.

A probability function on the sample space of non-negative integers
S = {0, 1, 2, ...} of the form

$f(x) = \frac{e^{-m} m^x}{x!}$ is said to obey the Poisson probability law with
parameter ‘m’.

The Poisson distribution is a discrete distribution since x is a
non-negative integer.


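By substitution of the values x = 0, 1, 2, ...

x:     0                 1                          2                              3 ...
f(x):  $e^{-\lambda}$    $\lambda e^{-\lambda}$     $\frac{\lambda^2 e^{-\lambda}}{2!}$     $\frac{\lambda^3 e^{-\lambda}}{3!}$ ...

so that $\sum f(x) = e^{-\lambda}\left(1 + \lambda + \frac{\lambda^2}{2!} + \frac{\lambda^3}{3!} + \ldots\right) = e^{-\lambda} \cdot e^{\lambda} = 1$

Hence this is a probability distribution.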


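The Poisson distribution is a limiting form of the Binomial
distribution as n → ∞, p → 0 and np remains finite (say λ).
That is, for any fixed x = 0, 1, 2, ... and λ > 0

$\lim_{n \to \infty} {}^{n}C_{x} \left(\frac{\lambda}{n}\right)^x \left(1 - \frac{\lambda}{n}\right)^{n-x} = \frac{e^{-\lambda} \lambda^x}{x!}$

since np = λ, p = λ/n and q = 1 - λ/n.

L.H.S. $= \lim_{n \to \infty} \frac{\lambda^x}{x!} \left(1 - \frac{\lambda}{n}\right)^{n} \left(1 - \frac{\lambda}{n}\right)^{-x} \frac{n(n-1)\cdots(n-x+1)}{n^x}$

Since $\lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^n = e^{-\lambda}$, $\lim_{n \to \infty} \left(1 - \frac{\lambda}{n}\right)^{-x} = 1$ and $\lim_{n \to \infty} \frac{n(n-1)\cdots(n-x+1)}{n^x} = 1$,

L.H.S. $= e^{-\lambda} \cdot \frac{\lambda^x}{x!}$ = R.H.S.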
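The limiting statement can be watched numerically by holding np = λ fixed and letting n grow. The following is a minimal sketch with λ = 2 and x = 3, values chosen here only for illustration; both helper names are hypothetical.

```python
from math import comb, exp, factorial


def binomial_pmf(x, n, p):
    """Binomial probability of exactly x successes in n trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """Poisson probability e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam ** x / factorial(x)


lam, x = 2.0, 3                     # illustrative values only
limit = poisson_pmf(x, lam)
for n in (10, 100, 1000, 10000):
    # p shrinks as n grows so that n*p stays equal to lam.
    print(n, binomial_pmf(x, n, lam / n), limit)
```

The printed Binomial values approach the Poisson value as n increases, which is why the Poisson law serves as an approximation when n is large and p is small.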


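3.6 MEAN AND VARIANCE OF THE POISSON
DISTRIBUTION

Mean $= E(X) = \sum_{x=0}^{\infty} x\,p(x) = \sum_{x=0}^{\infty} x\,\frac{e^{-\lambda} \lambda^x}{x!}$

$= \sum_{x=1}^{\infty} \frac{x\,e^{-\lambda} \lambda^x}{x\,(x-1)!} = \lambda e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!}$

$= \lambda e^{-\lambda} \left[1 + \lambda + \frac{\lambda^2}{2!} + \ldots\right]$
$= \lambda e^{-\lambda} \cdot e^{\lambda} = \lambda$
Hence the mean of the Poisson distribution is λ.
Variance $= E(X^2) - [E(X)]^2$

$E(X^2) = \sum_{x=0}^{\infty} x^2\,\frac{e^{-\lambda} \lambda^x}{x!} = \sum_{x=0}^{\infty} [x(x-1) + x]\,\frac{e^{-\lambda} \lambda^x}{x!}$

$= \lambda^2 e^{-\lambda} \sum_{x=2}^{\infty} \frac{\lambda^{x-2}}{(x-2)!} + \lambda e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!}$

$= \lambda^2 e^{-\lambda} \cdot e^{\lambda} + \lambda e^{-\lambda} \cdot e^{\lambda} = \lambda^2 + \lambda$
Hence variance $= E(X^2) - [E(X)]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda$

Note that the mean and variance are equal in the case of the Poisson
distribution.

Proceeding in the same pattern, higher moments can also be
obtained: $\mu_3 = \lambda$ and $\mu_4 = 3\lambda^2 + \lambda$.

Hence $\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{\lambda^2}{\lambda^3} = \frac{1}{\lambda}$ and $\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{3\lambda^2 + \lambda}{\lambda^2} = 3 + \frac{1}{\lambda}$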

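The recursion formula for the Poisson distribution can be obtained
(similar to the method adopted in the Binomial distribution)

$p(x) = \frac{e^{-\lambda} \lambda^x}{x!}$, x = 0, 1, 2, 3, ...

$p(x+1) = \frac{e^{-\lambda} \lambda^{x+1}}{(x+1)!}$

Hence $\frac{p(x+1)}{p(x)} = \frac{e^{-\lambda} \lambda^{x+1}}{(x+1)!} \cdot \frac{x!}{e^{-\lambda} \lambda^x} = \frac{\lambda}{x+1}$

∴ $p(x+1) = \frac{\lambda}{x+1}\,p(x)$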


The following are some of the examples of random phenomena that 
follow the Poisson probability laws. 


3.7 SOLVED PROBLEMS

1. It is known that the probability that an item produced by a
certain machine will be defective is 0.1. Find the probability that a
sample of 10 items selected at random from the output of the
machine will contain not more than one defective item.

$p(x) = \frac{e^{-\lambda} \lambda^x}{x!}$

Probability that an item produced is defective = 0.1 = p
np = 10 (0.1) = 1 = λ
∴ p (not more than one defective item) $= \frac{e^{-1} 1^0}{0!} + \frac{e^{-1} 1^1}{1!}$
= 0.3679 + 0.3679
= 0.7358
The solution can also be obtained by applying the Binomial
distribution.
Since p = 0.1, q = 1 - 0.1 = 0.9
p (not more than one defective) $= {}^{10}C_{0}\,(0.1)^0 (0.9)^{10} + {}^{10}C_{1}\,(0.1)^1 (0.9)^9$
= 0.7361


2. It is known that bacteria of a certain type occur in water
at the rate of 2 bacteria per cubic centimetre of water. Assuming
that this phenomenon follows a Poisson probability law, what is
the probability that a sample of two cubic centimetres of water will
contain (a) no bacteria (b) at least two bacteria?

λ = (rate at which bacteria occur) (volume of sample water)
= 2 × 2 = 4

Probability that there will be no bacteria $= p(0) = \frac{e^{-4} 4^0}{0!}$
$= e^{-4} = 0.0183$
Probability that there will be two or more bacteria

$= 1 - [p(0) + p(1)] = 1 - \left[e^{-4} + \frac{e^{-4} \cdot 4}{1!}\right]$
$= 1 - 5e^{-4} = 1 - 5\,(0.0183) = 0.9085$


3. In a certain published book of 280 pages, 210 printing mistakes
occur. What is the probability that 4 pages selected at random by
the printer as illustrations of his work will be free from errors?

Let us assume that the printing mistakes (printer’s mistakes) follow
the Poisson law, the error rate being 210/280 = 3/4 per page.

Hence for 4 pages the expected number of errors λ = (3/4) (4) = 3

The probability that there will be no errors in the 4 pages

$p(0) = \frac{e^{-3} 3^0}{0!} = e^{-3} = 0.0498$


4. If the probability that an individual suffers a bad reaction
from injection of a given serum is 0.002, determine the probability
that out of 1000 individuals (a) exactly 3 and (b) more than 2
individuals will suffer a bad reaction.

Here p = 0.002 and hence λ = (1000) (.002) = 2

(a) Probability (3 individuals suffer from bad reaction) = p(3)

$= \frac{e^{-2} 2^3}{3!} = (0.1353)\left(\frac{8}{6}\right) = 0.1804$
(b) p {more than 2 individuals suffer}
$= 1 - \{p(0) + p(1) + p(2)\}$
$= 1 - \left\{e^{-2} + 2e^{-2} + \frac{2^2 e^{-2}}{2!}\right\}$
$= 1 - e^{-2}(1 + 2 + 2) = 1 - 5e^{-2}$
$= 1 - 5\,(.1353) = 0.3235$
5. From the records an insurance company finds that the
probability of death within one year is approximately 0.0005 for a
given age group. For 1000 policy holders at that age, what is the
probability of no claims, of 1 claim, of two or more claims?

p = 0.0005 and hence λ = (0.0005) (1000) = 0.5

$p(\text{no claims}) = \frac{e^{-0.5} (0.5)^0}{0!} = e^{-0.5} = 0.6065$

$p(\text{1 claim}) = \frac{e^{-0.5} (0.5)^1}{1!} = (0.6065)(0.5) = 0.30325$

p (2 or more claims) = 1 - [p(0) + p(1)]
= 1 - [0.6065 + 0.30325]
= 1 - .90975 = 0.09025


6. In a Poisson distribution if p(X = 2) = p(X = 3) find
p(X = 4). (II PUC-April 83)
Since

$p(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$

$p(X = 2) = \frac{e^{-\lambda} \lambda^2}{2!}$

and $p(X = 3) = \frac{e^{-\lambda} \lambda^3}{3!}$
Since p(X = 2) = p(X = 3),
$\frac{e^{-\lambda} \lambda^2}{2!} = \frac{e^{-\lambda} \lambda^3}{3!}$, i.e. $6\lambda^2 = 2\lambda^3$
or $\lambda^3 = 3\lambda^2$
∴ λ = 3

So that $p(X = 4) = \frac{e^{-3} 3^4}{4!} = (0.0498)\left(\frac{81}{24}\right) = 0.1681$
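Several of the solved problems reduce to evaluating the Poisson formula and summing a few terms, so the answers can be cross-checked by machine. This is a sketch under the parameter values used in Problems 2, 4 and 6 (λ = 4, 2 and 3 respectively); the helper names are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)


def poisson_cdf(x, lam):
    """P(X <= x), obtained by summing the individual terms."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


no_bacteria = poisson_pmf(0, 4)                 # Problem 2 (a)
two_or_more_bacteria = 1 - poisson_cdf(1, 4)    # Problem 2 (b)
more_than_two_reactions = 1 - poisson_cdf(2, 2) # Problem 4 (b)
prob_x_equals_4 = poisson_pmf(4, 3)             # Problem 6

print(no_bacteria, two_or_more_bacteria,
      more_than_two_reactions, prob_x_equals_4)
```

Small differences in the fourth decimal place between these values and the worked answers come from the four-figure values of the exponential used in hand calculation.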


3.8 FITTING OF THE POISSON DISTRIBUTION

It has been observed that on the basis of the values of ‘n’ and the
probability, the mean λ for the Poisson distribution can be obtained
easily, whereas in certain other situations the value of ‘λ’ has to
be obtained experimentally by equating the mean of the sample
distribution to the mean of the Poisson distribution.


Illustration: The amount of dust in the atmosphere may be
estimated using an ultra-microscope, wherein a very small volume of
air is illuminated by a spark and the observer counts the number
of particles of dust he sees. By repeating this operation a large
number of times, the amount of dust in each cubic centimetre of
air can be estimated. Suppose that the following test results were
obtained in a series of 300 spot checks by the flash method. Calculate
the expected frequencies for each number of particles for comparison
with the observed frequencies shown in the table.

Number of particles:      0    1    2    3    4    5    6    7
Frequency of occurrence:  38   75   89   54   20   19   0    5

Mean of the observed distribution = Σfx / Σf = 625/300 = 2.0833
This is equated to λ, the mean of the Poisson distribution.

$p(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$

x    p(X = x)                                        Probability   Expected frequency
0    $e^{-2.0833}$                                   = 0.1246      0.1246 × 300 = 38
1    $e^{-2.0833}\,(2.0833)$                         = 0.2596      0.2596 × 300 = 78
2    $e^{-2.0833}\,\frac{(2.0833)^2}{2!}$            = 0.2704      0.2704 × 300 = 81
3    $e^{-2.0833}\,\frac{(2.0833)^3}{3!}$            = 0.1878      0.1878 × 300 = 56
4    $e^{-2.0833}\,\frac{(2.0833)^4}{4!}$            = 0.0986      0.0986 × 300 = 30
5    $e^{-2.0833}\,\frac{(2.0833)^5}{5!}$            = 0.0407      0.0407 × 300 = 12
6    $e^{-2.0833}\,\frac{(2.0833)^6}{6!}$            = 0.0141      0.0141 × 300 = 4
7    $e^{-2.0833}\,\frac{(2.0833)^7}{7!}$            = 0.0042      0.0042 × 300 = 1
Total                                                1.0000        300

Number of particles:      0    1    2    3    4    5    6    7    Total
Frequency of occurrence:  38   75   89   54   20   19   0    5    300
Expected frequency:       38   78   81   56   30   12   4    1    300


Another illustration: Bortkiewicz studied the number of men 
killed by horse kicks in 10 Prussian army corps over a period of 20 
years and found the following frequencies. 


Number of deaths per corps year :   0    1    2    3    4   Total
Frequency                       : 109   65   22    3    1    200

Compute the mean and variance and show that these are nearly equal. Fit a Poisson distribution to the data.
x        f      fx     fx²
0       109      0       0
1        65     65      65
2        22     44      88
3         3      9      27
4         1      4      16
Total   200    122     196

Mean $= \dfrac{\sum fx}{\sum f} = \dfrac{122}{200} = 0.61$

Variance $= \dfrac{\sum fx^2}{\sum f} - \left(\dfrac{\sum fx}{\sum f}\right)^2 = \dfrac{196}{200} - (0.61)^2 = 0.98 - 0.3721 = 0.6079$

Hence Mean ≈ Variance.

Since λ = 0.61,

$p(0) = e^{-0.61}\,\dfrac{(0.61)^0}{0!} = e^{-0.61} = 0.5434$

Using the recursion formula, p(1), p(2), ... can be obtained:

$p(x + 1) = p(x)\,\dfrac{\lambda}{x + 1}$

$p(1) = p(0)\,\dfrac{0.61}{1} = 0.3315$

$p(2) = p(1)\,\dfrac{0.61}{2} = 0.1011$

$p(3) = p(2)\,\dfrac{0.61}{3} = 0.0206$

$p(4) = p(3)\,\dfrac{0.61}{4} = 0.0031$

The expected frequencies are

x        Probability   Expected frequency   Observed frequency
0        0.5434        109                  109
1        0.3315         66                   65
2        0.1011         20                   22
3        0.0206          4                    3
4        0.0031          1                    1
Total    0.9997        200                  200

(The remaining probability of about 0.0003 belongs to x = 5 and above.)
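The fitting procedure used for the horse-kick data (equate λ to the sample mean, obtain p(0), then apply the recursion) can be sketched in Python as follows. The function name `fit_poisson` is hypothetical, and the sketch assumes the x values run 0, 1, 2, ... without gaps.

```python
import math

def fit_poisson(freqs):
    """Fit a Poisson distribution to frequencies observed at x = 0, 1, 2, ...

    Returns (mean, variance, probabilities, expected frequencies).
    """
    n = sum(freqs)
    mean = sum(x * f for x, f in enumerate(freqs)) / n
    variance = sum(x * x * f for x, f in enumerate(freqs)) / n - mean ** 2

    probs = [math.exp(-mean)]                      # p(0) = e^(-lambda)
    for x in range(len(freqs) - 1):
        probs.append(probs[-1] * mean / (x + 1))   # p(x+1) = p(x) * lambda / (x+1)

    expected = [n * p for p in probs]
    return mean, variance, probs, expected

observed = [109, 65, 22, 3, 1]   # deaths per corps year: 0, 1, 2, 3, 4
mean, variance, probs, expected = fit_poisson(observed)

print(f"mean = {mean:.2f}, variance = {variance:.4f}")
for x, (o, e) in enumerate(zip(observed, expected)):
    print(f"x = {x}: observed {o:3d}, expected {e:6.2f}")
```

Rounding the expected values to whole numbers gives 109, 66, 20, 4 and 1, which can be set against the observed frequencies.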


3.9 EXERCISES 


1. Define a Binomial variate and write down the probability function of a Binomial distribution.

2. If 'p' is the constant probability of success in each of 'n' independent trials of an experiment, show that the probability of getting exactly x successes is
   $p(x) = {}^{n}C_{x}\,p^{x}q^{n-x}$, where q = 1 − p. (II PUC-Apr. 1978)

3. Write down the probability function of a Binomial distribution and give two examples for the same. (II PUC-Apr. 1975)

4. Obtain the mean and variance of a Binomial distribution. (II PUC-Mar. 1977)

5. Find the mean of a Binomial variable. (II PUC-Apr. 1983)

6. If f(x) represents the probability function of a Binomial variate with parameters n and p, write down the expression for f(x). Show that Σf(x) = 1. Under what conditions does the distribution tend to the Poisson distribution? (Apr. 1983-II PUC (Old Scheme))

7. Describe the Binomial distribution and mention its important properties.

8. Suppose that two brands of aspirin are equally effective in reducing pain. What is the probability that if 10 individuals who need pain relief are asked to try both brands, at least 8 of them choose Brand A?

9. A restaurant can accommodate 50 customers. Experience indicates that 10% of those who make reservations will not turn up. Suppose that the restaurant accepts 55 reservations. Let x denote the number of customers who turn up. Find an expression for p(x) which gives the distribution of 'x'. Compute the mean and variance.

10. If 'X' denotes the number of heads in a single toss of 4 fair coins, find (a) p(X < 2), (b) p(X ≤ 2) and (c) p(1 < X ≤ 3).

11. Out of 800 families with 5 children each, how many would you expect to have (a) 3 boys, (b) 5 girls, (c) either 2 or 3 boys? Assume equal probabilities for boys and girls.

12. Find 'p' for a Binomial random variable 'X' if n = 6 and 9p(X = 4) = p(X = 2).

13. It was found that 10% of girls in a certain class had short sight. What is the probability that a random sample of 5 girls will contain (a) no girl suffering from short sight, (b) exactly one girl suffering from short sight, (c) not more than 4 girls suffering from short sight?

14. A door-to-door salesman sells 3 sizes of brushes, which he calls large, extra large and giant. He estimates that among the persons he calls upon the probabilities are 0.4 that he will make no sale, 0.3 that he will sell a large brush, 0.1 that he will sell an extra large brush and 0.2 that he will sell a giant brush. Find the probability that in 4 calls he will sell (a) no brushes, (b) 4 large brushes, (c) at least 1 brush of each kind.

15. The probability of a bomb hitting the bridge is 1/5. If 6 bombs are thrown at the bridge, find the probability that 2 bombs hit the bridge. (II PUC-March 1974)

16. Fit a Binomial distribution to the given distribution:

    x :  0    1    2    3    4
    f :  4   30   36   55    5

17. Ten cards were drawn from a pack of cards, one at a time with replacement after each draw, and the number of black cards 'x' was noted. This procedure was repeated 1000 times and the following frequency distribution was obtained:

    x :  0   1   2    3    4    5    6    7   8   9  10   Total
    f :  3  10  43  116  221  247  202  115  34   9   0    1000

    Compute the theoretical frequencies by taking p = 1/2.

18. Define a Poisson variate. Give two examples of Poisson distribution. (II PUC-Oct 1977)

19. Write down the first six terms of a Poisson distribution with mean 'm'. (II PUC-Oct 1974)

20. Describe briefly a Poisson distribution. (II PUC-March 1974)

21. Prove that the mean and variance of Poisson distribution are equal. (II PUC-March 1976)

22. In a Poisson distribution, the probability of '0' successes is 10%. Find the mean of the distribution. (II PUC-April 1983)

23. Let 'X' be a Poisson variate such that p(X = 3) = p(X = 4). Find p(X ≥ 1). (II PUC-April 1983 (O.S))

24. Prove that the mean of the Poisson distribution is λ. (II PUC-Apr 1984)

25. Suppose that a certain digital computer which operates 24 hours a day suffers breakdowns at the rate of 0.25 per hour. It is observed by 'A' that the computer has performed satisfactorily for 2 hours. What is the probability that the machine will not fail within the next 2 hours?

26. Find the probability that no defective fuse will be found in a box of 200 fuses if experience shows that 2% of such fuses are defective.

27. Assume that the chance of an individual coal miner being killed in a mine accident during a year is [fraction illegible]. Use the Poisson law to calculate the probability that in a mine employing 350 miners there will be at least one fatal accident in a year.


28. Experience of a certain disease indicates that it has a fatality rate of 10%. A new treatment tried out on 30 patients results in 7 deaths. Is the evidence sufficiently strong to show that this treatment is inimical to the best interests of the patients?

29. The average number of deaths due to road accidents in a certain State was 3.00 per 100,000 population in a year. Find the probability that in a particular city of the State with a population of 200,000 there will be (i) 0, (ii) 2, (iii) 6 and (iv) between 4 and 8 accidental deaths per year.

30. The probability that a man aged 35 will die before reaching the age of 40 may be taken as 0.018. Out of a group of 40 men, now aged 35, what is the probability that 'x' will die within the next 5 years? Draw a table of the probabilities for different values of x.

31. Fit a Poisson distribution to the following data:

    x :   0    1    2    3    4    5
    f :  22   13    5    5    3    2
    (II PUC-Apr. 1984)

32. Fit a Poisson distribution to the following data:

    x :    0     1    2    3    4    5
    f :  142   156   69   27    5    1

33. The following data gives the excessive rainstorms during a period of 33 years:

    Number of excessive rain storms :    0     1    2    3    4    5
    Frequency                       :  102   116   74   28   10    2

    Find the frequencies of the Poisson distribution which has the same mean as this distribution.

34. Certain manufactured items are packed in cartons of 100 articles. 0.47 per cent are said to be defective articles. What proportion of articles are free from defects?


7. Normal Distribution 


7.1. The Binomial and Poisson distributions considered earlier are probability distributions for discrete variables. Let us now consider a probability distribution for continuous variables.

A random variable 'X' is said to be normally distributed if its density function is given by

$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$

The normal distribution has two parameters μ and σ, and

$\pi \approx \dfrac{22}{7}$ and $e = 1 + \dfrac{1}{1!} + \dfrac{1}{2!} + \dfrac{1}{3!} + \ldots = 2.7182\ldots$

The normal distribution is a limiting form of the Binomial distribution as n → ∞, with neither p nor q being very small. Since the normal distribution depends on the parameters μ and σ (σ > 0), different normal curves are obtained for various values of μ and σ.


7.2. PROPERTIES OF THE NORMAL DISTRIBUTION 

1. The normal distribution is symmetrical with the greatest frequency at the centre; the frequencies fall off to exceedingly small values at any considerable distance from the centre. (x − μ) is the distance of the observation x from the centre of the distribution μ. The shape of the curve is much like that of a bell.

2. The total area under the curve is 1. The area bounded by the curve $f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\dfrac{1}{2}\left(\dfrac{x-\mu}{\sigma}\right)^2\right\}$ and the x-axis is equal to one, so that the area under the curve between two ordinates X = a and X = b, where a < b, gives the probability that X lies between a and b, i.e. p(a < X < b).

3. The mean, median and mode coincide.

4. The coefficient of skewness = 0 and $\beta_2 = 3$, and the curve is symmetrical about the mean.

5. Mean deviation $= \sigma\sqrt{\dfrac{2}{\pi}} = 0.7979\,\sigma$.

6. 68.27% of the observations fall between Mean − 1 S.D. and Mean + 1 S.D.,
95.45% of the observations fall between Mean − 2 S.D. and Mean + 2 S.D.,
and 99.73% of the observations fall between Mean − 3 S.D. and Mean + 3 S.D.

In other words
$p(\mu - \sigma < X < \mu + \sigma) = 0.6827$
$p(\mu - 2\sigma < X < \mu + 2\sigma) = 0.9545$
$p(\mu - 3\sigma < X < \mu + 3\sigma) = 0.9973$

When μ = 0 and σ = 1, the normal curve is called a standard normal curve. In other words, when X is expressed in terms of $Z = \dfrac{X - \mu}{\sigma}$, the distribution is obtained as $f(Z) = \dfrac{1}{\sqrt{2\pi}}\,e^{-Z^2/2}$, which is called the standard form, and Z is normally distributed with mean zero and variance unity.
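The three percentages in property 6 do not depend on μ or σ and can be reproduced numerically. The sketch below uses the error function from Python's standard library; the identity p(|Z| < k) = erf(k/√2) is a standard result, and the function name `prob_within` is only illustrative.

```python
import math

def prob_within(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal variate X."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within mean +/- {k} S.D.: {prob_within(k):.4f}")
```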

Consider the following frequency distribution which gives the heights of 1000 adult men.

Height in inches             : 61-  62-  63-  64-  65-  66-  67-  68-  69-  70-  71-  72-  73-  74-  Total
Number of men of given height:   2    5   17   43   86  152  193  197  148   91   45   16    4    1   1000

The mean of this distribution $= 68.5 - \dfrac{501}{1000} = 67.999$ or 68"

and the S.D. $= \sqrt{\dfrac{4275}{1000} - \left(\dfrac{501}{1000}\right)^2} = 2.006$ or 2"

The histogram of the frequency distribution is plotted in Fig. 12.
Note that the curve is bell shaped and almost symmetrical.
[Figure: frequency plotted against height in inches (x-variate), with the corresponding z-variate scale marked from −3 S.D. through the mean to +3 S.D.]
Fig. 12


Further, the median and mode for the distribution are 68".01 and 68".1 respectively, so that mean = median = mode approximately.
Further, let us examine yet another property of the normal distribution, namely how many men have heights between Mean ± 1 S.D. etc.

Mean ± 1 S.D. = 68" ± 2" = 66" and 70"
Mean ± 2 S.D. = 68" ± 4" = 64" and 72"
Mean ± 3 S.D. = 68" ± 6" = 62" and 74"

Height in inches     Number of men with said height
61-62                    2
62-63                    5      (lower limit 62" = Mean − 3 S.D.)
63-64                   17
64-65                   43      (lower limit 64" = Mean − 2 S.D.)
65-66                   86
66-67                  152      (lower limit 66" = Mean − 1 S.D.)
67-68                  193
68-69                  197
69-70                  148      (upper limit 70" = Mean + 1 S.D.)
70-71                   91
71-72                   45      (upper limit 72" = Mean + 2 S.D.)
72-73                   16
73-74                    4      (upper limit 74" = Mean + 3 S.D.)
74-75                    1
Total                 1000

Number of men between Mean ± 1 S.D. (66" to 70") = 690
Number of men between Mean ± 2 S.D. (64" to 72") = 955
Number of men between Mean ± 3 S.D. (62" to 74") = 997

Thus it can be seen that 690 men out of 1000 possess heights between mean ± 1 S.D. In other words, 69% of the men have a stature neither less than (mean − 1 S.D.) nor more than (mean + 1 S.D.).

Similarly 955 out of 1000 or 95.5% of the men have heights between (mean − 2 S.D.) and (mean + 2 S.D.), and finally 997 out of 1000 or 99.7% have heights between (mean − 3 S.D.) and (mean + 3 S.D.).
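The mean, the S.D. and the three counts for the height data can be recomputed directly from the frequency table. The following is a minimal sketch; the class mid-points (61.5, 62.5, ...) and the helper name `count_between` are assumptions made for the illustration.

```python
import math

lower_limits = list(range(61, 75))   # classes 61-62, 62-63, ..., 74-75
freqs = [2, 5, 17, 43, 86, 152, 193, 197, 148, 91, 45, 16, 4, 1]

n = sum(freqs)
mids = [low + 0.5 for low in lower_limits]            # class mid-points
mean = sum(m * f for m, f in zip(mids, freqs)) / n
sd = math.sqrt(sum(f * (m - mean) ** 2 for m, f in zip(mids, freqs)) / n)

def count_between(lo, hi):
    """Number of men whose one-inch class lies wholly inside [lo, hi]."""
    return sum(f for low, f in zip(lower_limits, freqs)
               if low >= lo and low + 1 <= hi)

print(f"mean = {mean:.3f}, S.D. = {sd:.3f}")
for k in (1, 2, 3):
    lo, hi = 68 - 2 * k, 68 + 2 * k   # mean 68" and S.D. 2", as rounded in the text
    print(f"mean +/- {k} S.D. ({lo} to {hi}): {count_between(lo, hi)} men")
```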

It has been made clear earlier that the chief characteristics of the normal distribution are that (i) the curve is symmetrical around the mean, (ii) the mean, median and mode coincide and (iii) 68.27% of the observations lie between mean ± 1 S.D., 95.45% of the observations lie between mean ± 2 S.D. and 99.73% of the observations lie between mean ± 3 S.D.

Hence the frequency distribution of the heights of 1000 men very nearly fulfils the ideal normal distribution, which in real practice is a very rare phenomenon. Since this is only a hypothetical example, the distribution appears to be nearly normal. To overcome the difficulty of projecting such distributions, the standard normal curve is used in practice, wherein the mean = 0 and S.D. = 1; $\dfrac{x - \mu}{\sigma}$ is denoted by 'Z' and the equation of the normal curve is

$f(z) = \dfrac{1}{\sqrt{2\pi}}\,e^{-z^2/2}, \quad -\infty < z < \infty$

The table values for positive values of z are given in the table (Annexure-II). The purpose of writing the equation in standard form is that the areas may be tabulated. Thus the areas under the standard normal curve from Z = 0 to selected positive values of Z are provided in the table. From the table it can be seen that the area from Z = 0 to Z = 0.5 is 0.1915. As the curve is symmetrical, the area from −0.5 to 0 is the same as the area from 0 to 0.5. In the same way any other required area can be determined using the table values, with the prior knowledge that the whole area under the curve from −∞ to +∞ is 1.


7.3 ILLUSTRATIVE EXAMPLES

1. Assuming the height of university students to be a normal variable (say x) with mean 68" and S.D. 2", what percentage of such students are taller than 6'?

Since x = 6' = 72", we have to obtain p(x > 72").

$Z = \dfrac{x - \mu}{\sigma} = \dfrac{72 - 68}{2} = 2$

From the table, the area under the standard normal curve between Z = 0 and Z = 2 is 0.4772. Since the area under the right half of the curve is 0.5000,

p(Z > 2) = 0.5000 − 0.4772 = 0.0228

2.28% of the students will have heights exceeding 6'.
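Where the printed table is not at hand, the tabulated "area from 0 to Z" can be computed from the error function. The sketch below is one way to do this in Python; the function names `phi` and `area_0_to` are illustrative, not taken from the text.

```python
import math

def phi(z):
    """Cumulative standard normal probability P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def area_0_to(z):
    """Area under the standard normal curve between 0 and z (z >= 0)."""
    return phi(z) - 0.5

mu, sigma = 68, 2            # mean and S.D. of heights, in inches
z = (72 - mu) / sigma        # standardised value for 6 feet = 72 inches
p_taller = 0.5 - area_0_to(z)

print(f"Z = {z}, area 0 to Z = {area_0_to(z):.4f}, p(Z > {z}) = {p_taller:.4f}")
```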

2. The mean of a normal distribution is 50 and the variance is 100. Find the probability that a value of the random variable selected at random will be (i) less than 45, (ii) between 45 and 64, (iii) more than 64. (II PUC-Oct 1978)

Given μ = 50 and σ = √100 = 10

The S.N.V. $Z = \dfrac{X - \mu}{\sigma}$

(i) Less than 45

$Z = \dfrac{45 - 50}{10} = -0.5$

p(Z < −0.5) = 0.5 − 0.1915 = 0.3085

(ii) Between 45 and 64

$Z_1 = \dfrac{45 - 50}{10} = -0.5$

$Z_2 = \dfrac{64 - 50}{10} = 1.4$

p(45 < X < 64) = p(−0.5 < Z < 1.4) = 0.1915 + 0.4192 = 0.6107

(iii) More than 64

p(X > 64) = p(Z > 1.4) = 0.5000 − 0.4192 = 0.0808
3. The height distribution of 10,000 adult males is found to follow a normal distribution with mean 165 cm and standard deviation 4 cm. Find the number of adult males with height (i) below 159 cm, (ii) above 172 cm, (iii) between 156 cm and 174 cm. (II PUC-Apr. 1983)

(i) Below 159 cm

$Z = \dfrac{X - \mu}{\sigma} = \dfrac{159 - 165}{4} = -1.5$

p(X < 159) = p(Z < −1.5) = 0.5000 − 0.4332 = 0.0668
The number of adult males with height less than 159 cm = 0.0668 × 10,000 = 668

(ii) Above 172 cm

p(X > 172) = p(Z > 1.75), since $Z = \dfrac{172 - 165}{4} = 1.75$

= 0.5000 − 0.4599 = 0.0401
401 adult men will have height above 172 cm.

(iii) Between 156 and 174 cm

$Z_1 = \dfrac{156 - 165}{4} = -2.25$

$Z_2 = \dfrac{174 - 165}{4} = 2.25$

Thus p(156 < X < 174) = p(−2.25 < Z < 2.25) = 0.4878 + 0.4878 = 0.9756

i.e. 9756 adult men will have heights between 156 cm and 174 cm.
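The counts in this example can also be obtained without tables by using `statistics.NormalDist` from the Python standard library (available from Python 3.8). This is a sketch under that assumption; the variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=165, sigma=4)   # heights in cm
n = 10_000                              # number of adult males

below_159 = n * heights.cdf(159)
above_172 = n * (1 - heights.cdf(172))
between_156_174 = n * (heights.cdf(174) - heights.cdf(156))

print(f"below 159 cm        : {below_159:.1f}")
print(f"above 172 cm        : {above_172:.1f}")
print(f"between 156 and 174 : {between_156_174:.1f}")
```

The results agree with the table-based answers to within one man; the small differences come from the four-decimal rounding of the printed areas.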
4. If Z is normally distributed with mean 0 and variance 1, find
(a) p(Z > −1.64), (b) p(−1.96 < Z < 1.96), (c) p(|Z| > 1).

Mean = 0 and S.D. = 1, so that $Z = \dfrac{X - \mu}{\sigma}$ is already in standard form.

(a) p(Z > −1.64) = 0.5000 + 0.4495 = 0.9495
(b) p(−1.96 < Z < 1.96) = 0.4750 + 0.4750 = 0.9500
(c) p(|Z| > 1) = 1 − p(−1 < Z < 1) = 1 − 2(0.3413) = 1 − 0.6826 = 0.3174


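5. If the weights of 300 teachers are normally distributed with mean 68.0 kg and standard deviation 3 kg, how many teachers will have weights (a) greater than 72.5 kg, (b) less than or equal to 64 kg, (c) between 65 and 71 kg inclusive, (d) equal to 68 kg?

(a) Greater than 72.5 kg

$Z = \dfrac{72.5 - 68}{3} = \dfrac{4.5}{3} = 1.50$

p(x > 72.5) = p(Z > 1.50) = 0.5 − 0.4332 = 0.0668
0.0668 × 300 or 20 teachers will have weights greater than 72.5 kg.

(b) Less than or equal to 64 kg

$Z = \dfrac{64 - 68}{3} = -1.33$

p(Z < −1.33) = 0.5 − 0.4082 = 0.0918
0.0918 × 300 or 28 teachers will have weights less than or equal to 64 kg.

(c) Between 65 and 71 kg inclusive

$Z_1 = \dfrac{65 - 68}{3} = -1$ and $Z_2 = \dfrac{71 - 68}{3} = 1$

p(−1 < Z < 1) = 0.3413 + 0.3413 = 0.6826
0.6826 × 300 or 205 teachers will have weights between 65 and 71 kg.

(d) Equal to 68 kg

For a continuous variable the probability of any single exact value is zero, so the question has a meaning only if weights are taken as recorded to the nearest kg (an assumption the problem does not state). "Equal to 68 kg" then means between 67.5 and 68.5 kg:

$Z = \pm\dfrac{0.5}{3} = \pm 0.17$

p(−0.17 < Z < 0.17) = 0.0675 + 0.0675 = 0.1350
0.1350 × 300 or about 40 teachers will have weight equal to 68 kg.

Note that subtracting the three earlier answers from the total, 300 − (28 + 205 + 20) = 47, does not answer part (d): it counts the teachers whose weights fall in the gaps 64 to 65 kg and 71 to 72.5 kg.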
6. For a set of 1000 items known to be normally distributed, the mean is 534 cm and the S.D. is 13.5 cm.
(i) How many items are likely to exceed 561 cm?
(ii) How many will be between 520.5 cm and 547.5 cm?
(II PUC-Apr. 1984)

(i) $Z = \dfrac{x - \mu}{\sigma} = \dfrac{561 - 534}{13.5} = \dfrac{27}{13.5} = 2$

p(X > 561) = p(Z > 2) = 0.5000 − 0.4772 = 0.0228
0.0228 × 1000 = 22.8 or 23 items are likely to exceed 561 cm.

(ii) $Z_1 = \dfrac{520.5 - 534}{13.5} = -1$

$Z_2 = \dfrac{547.5 - 534}{13.5} = 1$

p(520.5 < X < 547.5) = p(−1 < Z < 1) = 0.3413 + 0.3413 = 0.6826

or 683 of the items will be between 520.5 cm and 547.5 cm.
7. Given that the height of college boys is normally distributed with mean 62" and S.D. 4", and that the minimum height required for joining the N.C.C. is 64", find the percentage of boys who would be rejected on account of their height.

$Z = \dfrac{X - \mu}{\sigma} = \dfrac{64 - 62}{4} = 0.5$

p(X > 64") = p(Z > 0.5) = 0.5000 − 0.1915 = 0.3085
p(X < 64") = 1 − 0.3085 = 0.6915

69.15% of the boys would be rejected on account of their heights.


8. What is the probability of a value falling between Z = −0.75 and Z = 1.25?

From the tables

Z        Area from 0 to Z
0.75     0.2734
1.25     0.3944
Total    0.6678

Hence p(−0.75 < Z < 1.25) = 0.6678


9. Find the specified proportion of the following normal distributions:

(a) μ = 25, σ = 10, area between μ and 30
(b) μ = 25, σ = 25, area between 20 and 30
(c) μ = 0.10, σ = 0.02, area between 0.07 and 0.13
(d) μ = 5, σ = 20, area between 0 and 25
(e) μ = 100,000, σ = 2500, area between 95,000 and 99,000

(a) $Z = \dfrac{30 - 25}{10} = 0.5$

p(0 < Z < 0.5) = 0.1915

(b) $Z_1 = \dfrac{20 - 25}{25} = -0.20$ and $Z_2 = \dfrac{30 - 25}{25} = 0.20$

p(−0.20 < Z < 0.20) = 0.0793 + 0.0793 = 0.1586

(c) μ = 0.10, σ = 0.02

$Z_1 = \dfrac{x - \mu}{\sigma} = \dfrac{0.07 - 0.10}{0.02} = -1.5$

$Z_2 = \dfrac{0.13 - 0.10}{0.02} = +1.5$

p(−1.5 < Z < 1.5) = 0.4332 + 0.4332 = 0.8664

(d) μ = 5 and σ = 20

$Z_1 = \dfrac{0 - 5}{20} = -0.25$ and $Z_2 = \dfrac{25 - 5}{20} = 1.0$

p(−0.25 < Z < 1.0) = 0.0987 + 0.3413 = 0.4400

(e) μ = 100,000, σ = 2500

$Z_1 = \dfrac{95000 - 100000}{2500} = -2.0$

$Z_2 = \dfrac{99000 - 100000}{2500} = -0.4$

Z        Area between 0 and Z
0.4      0.1554
2.0      0.4772

[Figure: normal curve with x values 92,500 to 107,500 marked against z values −3 to +3]
Fig. 14.

The area between Z = −2 and Z = −0.4 is 0.4772 − 0.1554 = 0.3218

p(−2 < Z < −0.4) = 0.3218
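All five parts of this example follow the same pattern: standardise the two end points and take the area between them. A small Python sketch of that pattern is given below, using `statistics.NormalDist` from the standard library; the helper name `area_between` is hypothetical.

```python
from statistics import NormalDist

def area_between(mu, sigma, a, b):
    """Proportion of a normal(mu, sigma) distribution lying between a and b."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)

# (mu, sigma, lower end point, upper end point) for parts (a) to (e)
cases = {
    "a": (25, 10, 25, 30),
    "b": (25, 25, 20, 30),
    "c": (0.10, 0.02, 0.07, 0.13),
    "d": (5, 20, 0, 25),
    "e": (100_000, 2500, 95_000, 99_000),
}

areas = {part: area_between(*args) for part, args in cases.items()}
for part, area in areas.items():
    print(f"({part}) {area:.4f}")
```

For part (e) the upper end point standardises to (99,000 − 100,000)/2500 = −0.4, and the computed area is about 0.3218.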
10. The life of a light bulb is normally distributed with an average (mean) life of 100 hours and a variance of 36.
a. What percent will last more than 110 hours?
b. What percent will last between 85 and 95 hours?
c. 15% will burn out before what length of time?

(a) More than 110 hours

x = 110 hours, μ = 100 hours, σ = √36 = 6

$Z = \dfrac{110 - 100}{6} = 1.67$

p(Z > 1.67) = 0.5000 − 0.4525 = 0.0475
Hence 4.75% will last for more than 110 hours.

(b) Between 85 and 95 hours

$Z_1 = \dfrac{85 - 100}{6} = -2.5$

$Z_2 = \dfrac{95 - 100}{6} = -0.83$

p(−2.50 < Z < −0.83) = {area between 0 and 2.5 − area between 0 and 0.83}
= {0.4938 − 0.2967} = 0.1971

i.e. 19.71% will last between 85 and 95 hours.

(c) 15% will burn out before what length of time?

It is necessary to determine the Z value below which 15% will burn out (85% will not burn out).

15% burning out amounts to 85% not burning out. The 85% consists of the entire upper half of the area (50%) together with 35% of the lower half, so the Z value corresponding to an area of 0.3500 is read from the table. It is approximately 1.033, and since the point lies below the mean, Z = −1.033.

Since $Z = \dfrac{x - \mu}{\sigma}$,

x = μ + σZ = 100 + 6(−1.033)
= 100 − 6.198 = 93.802

i.e. 15% of the bulbs will burn out before about 93.8 hours.
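Part (c) is an inverse problem: the probability is given and the value of x is sought. In Python this can be sketched with the `inv_cdf` method of `statistics.NormalDist`, which plays the role of reading the table backwards; the variable names are illustrative.

```python
from statistics import NormalDist

life = NormalDist(mu=100, sigma=6)     # bulb life in hours

z_15 = NormalDist().inv_cdf(0.15)      # Z below which 15% of the area lies
t_15 = life.inv_cdf(0.15)              # time before which 15% burn out

print(f"Z = {z_15:.3f}")
print(f"15% of bulbs burn out before {t_15:.2f} hours")
```

The computed Z is close to −1.036, slightly different from the −1.033 read off the table by interpolation, so the time comes out near 93.78 hours rather than 93.80.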


7.4. FITTING OF DATA BY NORMAL DISTRIBUTION
Illustration: The table below shows the distribution of the maximum loads in kilonewtons supported by certain cables produced by a company.

Maximum load (kN) : 93-97  98-102  103-107  108-112  113-117  118-122  123-127  128-132  Total
Number of cables  :   2      5       12       17       14        6        3        1       60

Fit a normal distribution.
Solution: The initial step is to determine the mean and S.D. of the frequency distribution.

Class       f      m      d = (m − 110)/5     fd     fd²
93-97       2      95         −3              −6     18
98-         5     100         −2             −10     20
103-       12     105         −1             −12     12
108-       17     110          0               0      0
113-       14     115          1              14     14
118-        6     120          2              12     24
123-        3     125          3               9     27
128-132     1     130          4               4     16
Total      60                                 11    131

Mean $= 110 + \dfrac{11}{60} \times 5 = 110.92$

S.D. $= \sqrt{\dfrac{\sum fd^2}{N} - \left(\dfrac{\sum fd}{N}\right)^2} \times 5$, taken as 5.7945

Col. (4) values are obtained as under:
Since the mean = 110.92 and σ = 5.79,

Z for class boundary 92.5 $= \dfrac{92.5 - 110.92}{5.79} = -3.18$

Z for class boundary 97.5 $= \dfrac{97.5 - 110.92}{5.79} = -2.32$, etc.

Z for class boundary 132.5 $= \dfrac{132.5 - 110.92}{5.79} = 3.73$

Column (5): Areas from 0 to Z are obtained by referring to the tables.
Thus the area between 0 and −3.18 is 0.4993
Thus the area between 0 and −2.32 is 0.4898 etc.
Thus the area between 0 and 3.73 is 0.4999
(1)            (2)           (3)          (4)           (5)             (6)          (7)
Maximum load   Observed      Class        Z for class   Area under      Area for     Expected
(kN)           (class        boundaries   boundaries    normal curve    each class   frequency
               frequency)                               from 0 to Z
                              92.5        −3.18         0.4993
93-97            2            97.5        −2.32         0.4898          0.0095        0.57 or 1
98-102           5           102.5        −1.45         0.4265          0.0633        3.80 or 4
103-107         12           107.5        −0.59         0.2224          0.2041       12.25 or 12
108-112         17           112.5        +0.27         0.1064          0.3288       19.73 or 20
113-117         14           117.5        +1.14         0.3729          0.2665       15.99 or 16
118-122          6           122.5        +2.00         0.4772          0.1043        6.26 or 6
123-127          3           127.5        +2.86         0.4979          0.0207        1.24 or 1
128-132          1           132.5        +3.73         0.4999          0.0020        0.12 or 0
Total           60                                                                    60
Column (6): The column (5) values are the areas from 0 to the class boundaries, and hence the area for each class interval is obtained as shown below:

Area from 0 to class boundary 92.5 = 0.4993
Area from 0 to class boundary 97.5 = 0.4898
Thus the area for the class 93 to 97 = 0.0095

When the two boundaries of a class lie on opposite sides of the mean, as for the class 108 to 112, the two areas are added instead of subtracted.

Column (7) values are obtained by multiplying the values in column (6) by 60, the total frequency. Thus the expected frequencies for the various classes are 1, 4, 12, 20, 16, 6, 1, 0 respectively.
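The column-by-column procedure for fitting a normal distribution (standardise the class boundaries, find the area for each class, multiply by the total frequency) can be sketched in Python as below. The function name is hypothetical, and the mean and S.D. are passed in as parameters: the sketch uses the 110.92 and 5.79 adopted in the text so that its output can be compared with the worked table. Evaluating the S.D. formula directly from the Σfd and Σfd² totals gives a larger figure (about 7.33), so the S.D. is best treated as an input to be checked.

```python
from statistics import NormalDist

def expected_normal_frequencies(boundaries, total, mean, sd):
    """Expected class frequencies under a normal curve with the given mean and S.D."""
    dist = NormalDist(mean, sd)
    cdf = [dist.cdf(b) for b in boundaries]
    # area for each class = difference of cumulative areas at its two boundaries
    return [total * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

boundaries = [92.5 + 5 * i for i in range(9)]   # 92.5, 97.5, ..., 132.5
observed = [2, 5, 12, 17, 14, 6, 3, 1]

expected = expected_normal_frequencies(boundaries, sum(observed), 110.92, 5.79)
rounded = [round(e) for e in expected]

for lo, o, e in zip(boundaries, observed, expected):
    print(f"{lo + 0.5:5.0f}-{lo + 4.5:3.0f}: observed {o:2d}, expected {e:5.2f}")
```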


7.5 EXERCISES 


1. Write a brief note on the normal distribution explaining its role in Statistics.

2. What properties of the normal distribution are used for judging the normality of a given frequency distribution?

3. Write down any four properties of a normal distribution. (PUC-Apr. 1983)

4. Enumerate the various properties of a normal distribution.

5. Define a normal variate. Write down the probability function of a normal distribution whose (i) mean is zero and variance is unity, (ii) mean is 22 and variance is 64. (II PUC-Oct. 1978)

6. Write down the probability function of a normal variate. Explain any four properties of normal distribution. (II PUC-Apr. 1978)

7. What is meant by Standardised Normal Variate? State its importance. (II PUC-Mar. 1977)

8. Define a Standard Normal Variate. Write down the probability function of a normal distribution whose mean is 2 and variance 3. (II PUC-Apr. 1975)

9. Find the probability of a value falling in the following portions of a normal distribution.
(a) between Z = −0.42 and Z = 0.42
(b) between Z = −0.61 and Z = 1.35
(c) between Z = −0.87 and Z = 0.65
(d) beyond Z = 0.15
(e) smaller than Z = −0.50
(f) greater than Z = 1.33
(g) less than Z = 1.73
(h) greater than Z = 0.00

10. The weights of 1000 students are found to be normally distributed with mean 40 kgs and standard deviation 4 kgs. Find the number of students with weights (i) less than 50 kgs, (ii) between 40 kgs and 45 kgs. (II PUC-Apr. 1984)

11. The electric bills of a house have averaged Rs. 20 a month with a variance of 9. What is the chance that the bill might be more than Rs. 24? How much will the bill be ninety five percent of the time?

12. 'x' is given to be normally distributed with mean 10 and S.D. 5. A sample of 1000 values is drawn. Find (i) how many values exceed 15, (ii) how many values are contained between 18 and 21, (iii) how many values are less than 4.

13. The heights of students of a college follow a normal distribution with mean 60" and S.D. 3". If there are 1000 students in the college, find the number of students having height (i) between 58" and 66", (ii) more than 65". (II PUC-Apr. 1978)

14. The mean of a normal distribution is 25 and the standard deviation is 5. Find the probability that a value of the variable selected at random will be (i) less than 35 and (ii) between 20 and 35. (II PUC-Oct. 1976)

15. A variate is normally distributed with mean 20 and variance 16. Find the probability that its value lies between 14 and 28. (II PUC-Oct. 1975)

16. In a certain city the electricity board installs 2000 new electric lamps having an average life of 1000 burning hours with a S.D. of 200 hours. (i) What proportion of the lamps might be expected to fail in the first 700 burning hours? (ii) What number of lamps may be expected to fail between 900 and 1300 burning hours? (iii) After what period of burning hours would we expect that 10% of the lamps would have failed?
17. The mean I.Q. of a large number of children of age 14 was 100 with a S.D. of 16. Assuming the distribution to be normal, find (i) what percentage of the children had I.Q. under 80, (ii) what percentage of the children had I.Q.'s within the range μ ± 1.96σ.

18. Fit a normal distribution to the following data, which shows the distribution of the diameters of the heads of rivets manufactured by a company.

Diameter (mm) : 7.247-7.249, 7.250-, 7.253-, 7.256-, 7.259-, 7.262-,
Frequency     :      2,         6,      8,     15,     42,     68,
Diameter (mm) : 7.265-, 7.268-, 7.271-, 7.274-, 7.277-, 7.280-7.282   Total
Frequency     :   49,     25,     18,     12,      4,        1          250

19. Fit a normal distribution to the following data:

Marks     : 0-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89   Total
Frequency :  1     5     12     25     37     24     13      8      5      130


4. Large Sample Tests 


4.1. TESTS OF HYPOTHESES
Two chief areas of statistical inference are the estimation of parameters and the testing of hypotheses. In much experimental research the objective may be merely the estimation of parameters with the help of the information from the sample. Thus one may wish to estimate the population of a city on the basis of a sample survey. Though the ultimate purpose is estimation, one is led to ask something about the estimates themselves, as one may wish to compare the population estimates obtained by the earlier methods with those obtained by improved methods. Hence the question of decision making arises.

Decisions are a necessity. It may be a decision as to what type of dress to wear, what course and college to choose for one's studies, or what type of books (comics, horror stories, only text books) to read for improving knowledge. For all such decisions data have to be collected. Before deciding on the dress, we review the existing pattern all around and decide on the one which fits in best. An assessment is made on the basis of the marks obtained in the X standard in, say, mathematics and science before planning the course or the college for further study. Decisions on books are made on the basis of individual interest coupled with the number of books of a particular author and type borrowed from, say, a central library or circulating library.

Almost all realistic circumstances in which decisions are made involve risk. From the point of view of tests of hypotheses, the question is whether one model or the other offers a better explanation of the observed data. In most situations the purpose of gathering and analysing data is to decide on a course of action. It may be to find out whether method A is superior to method B of preserving food insofar as retention of vitamins is concerned. A new rice variety PUSA-205, which yields 6.5 tonnes per hectare as against the present average of 1.4 tonnes per hectare, has been evolved by the Indian Agricultural Institute, New Delhi. To decide about the new variety the probability distribution has to be known, and since it is not easy to obtain this for the population, sampling is resorted to, and the choice between actions A and B, namely the new variety or the conventional one, is decided on the basis of samples.

A statistical hypothesis is a statement concerning the probability distribution of a random variable. It is an assertion or conjecture about the distribution of one or more random variables. If the statistical hypothesis completely specifies the distribution, it is known as a simple hypothesis; otherwise it is known as a composite hypothesis. Hence a simple hypothesis specifies not only the functional form of the underlying distribution but also the values of its parameters. Consider the following examples to decide in each case whether the given hypothesis is simple or composite.

(i) The hypothesis that a random variable has a Poisson distribution with λ = 1.35.

(ii) The hypothesis that a random variable has a Poisson distribution with λ > 1.35.

(iii) The hypothesis that a random variable has a normal distribution with mean = 95.

The first example is a simple hypothesis since the distribution is specified and the sample size can be assumed. The second and third examples are clearly composite hypotheses: in the former, λ > 1.35 does not assign a specific value to λ, and the latter too is composite since the sample size is not specified and the other parameter σ is not indicated.

It is observed that BCG vaccination gives immunity against tuberculosis. To test this it is essential to find the proportion of individuals attacked by tuberculosis amongst those who were vaccinated and also amongst those who were not vaccinated; testing the hypothesis regarding the efficacy of vaccination from these observed proportions is an offshoot of the problem.

In a KAP (Knowledge, Attitude and Practice) study on population education for high school children, 222 high school teachers were interviewed, of whom 110 were imparting 'Population education' to the school children as a part of social science or otherwise. Amongst the 110 who were teaching this subject 80 were trained teachers, whereas amongst the 112 who were not teaching this subject 62 were trained. These observed values enable one to test the hypothesis regarding the efficacy of training for imparting population education at high school level.

The TV engineer has to decide on the basis of the sample data whether the true average life time of a TV tube is at least 500 hours; the maker of a popular moped claims that its petrol consumption is low and that it gives a mileage of at least 68 km/litre. Naturally the decision regarding the hypothesis is either to accept the hypothesis that $\mu_1 = \mu_2$ or to reject it in favour of $\mu_1 \neq \mu_2$. The former is known as the null hypothesis and the latter is called the alternate hypothesis.


4.2. A NULL HYPOTHESIS 

A null hypothesis is proposed when it specifies a parameter for a
model such that the probability can be computed for each and every
sample point; it is denoted by $H_0$ (thus for a binomial model
$H_0: p = p_0$). An alternative hypothesis, denoted by $H_1$, is one
in whose favour a conclusion can be drawn if the sample evidence
does not support the null hypothesis. For the binomial distribution
$H_1: p \neq p_0$, i.e. $H_1: p < p_0$ or $H_1: p > p_0$.

Hence the null hypothesis is always one for which it is possible to
compute probabilities of outcomes in the sample space, and it is on
these probabilities that the decision rule rests. It is usually the
hypothesis of no difference between or among treatments, whereas an
alternate hypothesis is usually a set of alternatives stating that a
parameter is different from that specified by the null hypothesis.
For a given null hypothesis, a decision has to be made as to which
sample outcomes tend to support it and which tend to deny it. If
about half the 100 metre runners in a specified training programme
qualify and the remaining half do not, then it is agreed that there
is little evidence to deny $H_0: p = 0.50$, i.e. that the programme
is of equal value where the qualifying mark is the criterion. On the
other hand, if most of the runners on one programme run faster than
their matched-pair mates, then the null hypothesis is not supported
and we decide on $H_1: p \neq 0.50$. The problem of hypothesis
testing is to decide whether or not the hypothesis that has been
formulated is correct, which ultimately results in one of two
decisions, namely accepting or rejecting the hypothesis. A decision
method for such a problem is called a test of the hypothesis in
question.
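The runners example can be made concrete with a short computation. The sketch below assumes a hypothetical 20 matched pairs, in 15 of which the runner from one programme is the faster (both figures are illustrative and not taken from the text), and computes how probable such a result, or a more extreme one, would be if $H_0: p = 0.50$ were true.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """Probability of k or more successes in n independent trials."""
    return sum(binomial_pmf(j, n, p) for j in range(k, n + 1))


# Hypothetical data (an assumption for illustration only):
# 20 matched pairs, 15 of them won by the runner from programme A.
n_pairs = 20
wins_for_a = 15

# Probability of 15 or more wins if H0: p = 0.50 is true.
tail_probability = upper_tail(wins_for_a, n_pairs, 0.50)
print(f"P(X >= {wins_for_a} | p = 0.50) = {tail_probability:.4f}")
```

A small tail probability means that the observed outcome would be unusual under the null hypothesis, which is the sense in which such an outcome "tends to deny" $H_0$.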

Usually hypotheses are stated opposite to what is believed to be
true. For instance, to show that students of one class have higher
average marks than those of another class, the hypothesis formulated
is that there is no difference in the average marks between the
classes, that is $\mu_1 = \mu_2$. Similarly, to show that one brand
of cigarette has a higher percentage of tar content compared to the
other, the hypothesis formulated could be $H_0: p_1 = p_2$, that is,
the two percentages are the same. Since the assumptions are usually
of no difference, such a hypothesis is known as the Null hypothesis.

Testing of statistical hypotheses is the application of a clear set
of rules to decide whether to accept the null hypothesis or to
reject it in favour of the alternative hypothesis. Suppose it is
desired to test the null hypothesis $H_0: \mu = \mu_0$ against the
alternative hypothesis $H_1: \mu = \mu_1$. The choice is made by
conducting an experiment and generating sample data from which the
value of the test statistic can be computed; each possible outcome
in the sample space then determines what action is to be taken.
Hence the test procedure is one of partitioning the sample space
into two subsets, namely the acceptance region for $H_0$ and the
rejection region for $H_0$. In other words, the acceptance region
for the null hypothesis $H_0$ consists of the set of values of the
test criterion for which $H_0$ will be accepted, whereas the
rejection region, or critical region, for $H_0$ is the set of values
of the test criterion for which $H_0$ will be rejected; the critical
value of the test criterion is the boundary value that separates its
range into the acceptance and rejection regions.


4.3. TYPE I AND TYPE II ERRORS 

The acceptance or rejection of a hypothesis is obviously based on
the information obtained from the sample and may lead to two types
of error. If the true value of the parameter $\mu$ is $\mu_0$ and
the experimenter inadvertently concludes that $\mu = \mu_1$, he is
committing an error which is referred to as a type I error.
Alternatively, if the true value of the parameter $\mu$ is $\mu_1$
and he concludes that $\mu = \mu_0$, an error is committed which is
known as a type II error.

Four possible situations arise in any test procedure and they are
(i) Accepting $H_0$ when $H_0$ is true (ii) Rejecting $H_0$ when
$H_0$ is false (iii) Rejecting $H_0$ when $H_0$ is true (type I
error) and (iv) Accepting $H_0$ when $H_0$ is false (type II error).

Type I error, or the error of the first kind, is made if a true
null hypothesis is rejected. The probability of a type I error is
denoted by $\alpha$, and this is called the 'Significance level' of
the test or the size of the test.


Type II error, or the error of the second kind, is made when a true
alternative hypothesis is rejected, which amounts to accepting the
null hypothesis when it is false; the probability of a type II error
is denoted by $\beta$.


2 x 2 table showing the correctness of decisions

                                       DECISION
Fact                      Accept $H_0$ and        Reject $H_0$ and
                          reject $H_1$            accept $H_1$

$H_0$ is true             Correct decision        Wrong decision
(i.e. $H_1$ is false)                             Type I error
                                                  $p = \alpha$

$H_0$ is false            Wrong decision          Correct decision
(i.e. $H_1$ is true)      Type II error
                          $p = \beta$


The region of rejection of $H_0$ is referred to as the critical
region, and the size of the critical region is the probability of
obtaining a value of the test statistic inside the critical region
when $H_0$ is true. Hence the size of the critical region is the
probability $\alpha$ of committing a type I error which, as
mentioned earlier, is also known as the level of significance. This
represents the probability of rejecting a true $H_0$ and obviously
is to be kept as small as possible. By convention it is fixed at
0.05 or 0.01 (one in twenty or one in a hundred); sometimes $\alpha$
is taken as 0.001 (one in one thousand), and when the experiment is
small $\alpha = 0.10$ is also used.


The following are the various stages of the procedure for testing a
statistical hypothesis:

1. Formulate the null and alternative hypotheses so as to arrive
   at a decision rule enabling one to be chosen over the other.

2. Fix the size of $\alpha$, the significance level for the test,
   and the sample size.

3. Decide on a test statistic whose probability distribution is
   known under the null hypothesis.

4. Having fixed $\alpha$, decide on those values of the test
   criterion that will call for rejection of the null hypothesis
   and those that will call for acceptance (partitioning of the
   sample space into acceptance region and rejection region).

5. Collect the data and compute the value of the test statistic
   (randomness of the sample is to be ensured).

6. From the value of the test statistic thus obtained, make the
   decision.
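The six stages can be sketched in a few lines of code. The sketch below is only an illustration under stated assumptions: it uses the large-sample Z statistic for a mean (developed in Section 4.4) as the test criterion, a two-sided alternative, and invented figures for TV tube lifetimes; the function name `z_test_two_sided` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_sided(sample_mean, mu0, sigma, n, alpha):
    """Test H0: mu = mu0 against H1: mu != mu0 with a Z statistic."""
    # Stage 4: the critical value splits the range of Z into an
    # acceptance region (|Z| <= critical) and a rejection region.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Stage 5: compute the test statistic from the sample.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Stage 6: make the decision.
    decision = "reject H0" if abs(z) > critical else "accept H0"
    return z, critical, decision


# Stages 1 to 3 (hypothetical figures, not from the text):
# H0: mu = 500 hours, H1: mu != 500 hours, alpha = 0.05, n = 64,
# known sigma = 40 hours, observed sample mean = 492 hours.
z, critical, decision = z_test_two_sided(
    sample_mean=492.0, mu0=500.0, sigma=40.0, n=64, alpha=0.05
)
print(f"Z = {z:.2f}, critical value = {critical:.2f}, decision: {decision}")
```

Fixing $\alpha$ and the critical value before looking at the data, as in stages 2 to 4, is what keeps the probability of a type I error at the chosen level.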


To illustrate these, consider the following situations: 

1) You are considering buying a new scooter, but you are somewhat
concerned by a recent news report about a serious traffic accident
resulting from the failure of a defective mechanism in the new
scooter. The salesperson assures you that the mechanism is perfectly
all right and that the scooter sold to you will not fail in traffic.
You decide to consider the truth of this statement to be
hypothetical. Construct a 2x2 table and indicate the seriousness of
any wrong decision.

Here the null hypothesis is that the salesperson spoke the truth and
the alternative hypothesis is that the salesperson told a falsehood.
                                    Decision
Hypothesis               $H_0$ is accepted       $H_0$ is rejected
                         (buy the scooter)       (do not buy the scooter)

$H_0$                    Good decision           Poor decision
                                                 (type I error)

$H_1$: defective         Poor decision           Good decision
mechanism, accident      (type II error)
prone
2) Consider a normal distribution with variance 1, for which it is
desired to test $H_0: \mu = 0$ against $H_1: \mu = 2$.

If $X$, the test statistic, is too large, $\mu = 2$ is suggested;
accordingly, if we choose $X > K$ (where $K$ is some appropriately
chosen value) to be the critical region, the figure below
illustrates the situation.

Fig. 17.

The figure shows the density function of $X$ corresponding to $H_0$
and $H_1$. The areas defining $\alpha$ and $\beta$ are indicated in
respect of the critical region $X > K = 1.75$. It demonstrates how,
by adjusting the value of $K$, $\alpha$ can be made arbitrarily
small and how $\beta$ increases in the process.
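The trade-off between the two error probabilities in this example can be checked numerically. The sketch below takes the figures of the example ($\sigma = 1$, $\mu = 0$ under $H_0$, $\mu = 2$ under $H_1$, critical region $X > K$); the function name and the two extra values of $K$ are assumptions added for comparison.

```python
from statistics import NormalDist


def error_probabilities(k, mu0=0.0, mu1=2.0, sigma=1.0):
    """Return (alpha, beta) for the critical region X > k."""
    # alpha: probability of landing in the critical region when H0 holds.
    alpha = 1 - NormalDist(mu0, sigma).cdf(k)
    # beta: probability of landing in the acceptance region when H1 holds.
    beta = NormalDist(mu1, sigma).cdf(k)
    return alpha, beta


# K = 1.75 is the value used in the text; 1.00 and 2.50 are added
# here only to show the direction of the trade-off.
for k in (1.00, 1.75, 2.50):
    alpha, beta = error_probabilities(k)
    print(f"K = {k:4.2f}   alpha = {alpha:.4f}   beta = {beta:.4f}")
```

Moving $K$ to the right shrinks $\alpha$ and enlarges $\beta$; for a single observation both cannot be made small at once.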


4.4. TESTING OF HYPOTHESIS FOR MEAN 

Among all the parameters which describe a probability distribution,
the mean is of paramount utility, and hence let us consider the
problem of estimating the expectation and of testing hypotheses
concerning the expectation on the basis of the information obtained
from a sample. The sample mean $\bar{X}$ is the statistic used, and
the population mean $\mu$ is estimated using $\bar{X}$. The testing
of hypotheses concerns $\mu$, using $\bar{X}$ as the test statistic.
As the observations studied are random samples, naturally the sample
mean is a random variable. If the mean of a sample is not equal to
$\mu$, we need to know how close it should have been, for which we
require the probability distribution of $\bar{X}$.
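The mean of the sample means is equal to the population mean, i.e.
$E(\bar{X}) = \mu$, since, by the additive property of expectation,
we have

$$E(\bar{X}) = E\left[\frac{X_1 + X_2 + X_3 + \dots + X_n}{n}\right]
= \frac{1}{n}\left[E(X_1) + E(X_2) + E(X_3) + \dots + E(X_n)\right]
= \frac{1}{n}\,[n\mu] = \mu$$

Similarly it can be seen that the variance of $\bar{X}$ is equal to
$\sigma^2/n$, i.e.

$$V(\bar{X}) = V\left[\frac{1}{n}\sum_{i=1}^{n} X_i\right]
= \frac{1}{n^2}\left[V(X_1) + V(X_2) + \dots + V(X_n)\right]
= \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$$

the observations being independent.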

Thus it is obvious that the variability in $\bar{X}$, as measured
by its variance, decreases as $n$ increases; further, as $n$ tends
to infinity, $V(\bar{X})$ tends to zero.

Also the probability distribution of $\bar{X}$ tends to the normal
distribution as $n$ tends to infinity.

By repeated samples of size $n$ from any population, the frequency
distribution of the sample means $\bar{X}$ has mean $\mu$ and
variance $\sigma^2/n$. The standard deviation of $\bar{X}$, namely
$\sigma/\sqrt{n}$, is also known as the 'Standard Error'. Thus if
$X_1, X_2, X_3, \dots, X_n$ constitute a random sample from an
infinite population with mean $\mu$ and variance $\sigma^2$, then
the limiting distribution of

$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \quad \text{as } n \to \infty$$

is a standard normal distribution [N(0, 1)].
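The statement that the sample means have mean $\mu$ and standard error $\sigma/\sqrt{n}$ can be checked by simulation. The sketch below is an illustration only: it assumes a uniform population on (0, 1), for which $\mu = 0.5$ and $\sigma^2 = 1/12$, and the sample size, number of repetitions and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, pstdev


def simulate_sample_means(n, repetitions, seed=0):
    """Draw `repetitions` samples of size n and return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [mean(rng.random() for _ in range(n)) for _ in range(repetitions)]


n = 25
sigma = sqrt(1 / 12)  # standard deviation of the uniform(0, 1) population
sample_means = simulate_sample_means(n, repetitions=2000)

observed_mean = mean(sample_means)
observed_se = pstdev(sample_means)
theoretical_se = sigma / sqrt(n)

print(f"mean of sample means   = {observed_mean:.4f}   (mu = 0.5)")
print(f"s.d. of sample means   = {observed_se:.4f}")
print(f"sigma / sqrt(n)        = {theoretical_se:.4f}")
```

Repeating the run with a larger $n$ shows the spread of the sample means shrinking in proportion to $1/\sqrt{n}$.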
Consider now how a confidence interval can be used in a test. We
know that

$$P\left(\frac{|\bar{X} - \mu|}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$$

which can be written as

$$\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}$$

and this inequality has the probability 0.95 that the interval
computed from $\bar{X}$ will include the actual fixed value of
$\mu$.

Fig. 20, Fig. 21. Standardised normal variate Z: a value at ±1.96 is
just significant at the 5% level (p = 0.05) and a value at ±2.58 is
just significant at the 1% level (p = 0.01); values beyond these
limits are significant (p < 0.05, p < 0.01) and values within them
are not significant (p > 0.05, p > 0.01) at the respective level.

Consider the hypothesis $H_0: \mu = \mu_0$ with the two-sided
alternative $H_1: \mu \neq \mu_0$ (assume $\sigma^2$ is given and is
the same under both $H_0$ and $H_1$), and adopt as a test that $H_0$
will be rejected if $\mu_0$ does not lie in the confidence interval,
i.e. if

$$|\mu_0 - \bar{X}| > 1.96\,\frac{\sigma}{\sqrt{n}}$$

which in other words amounts to saying that $\bar{X}$ falls farther
from $\mu_0$ than a criterion. Thus the confidence coefficient 0.95
is $1 - \alpha$, where $\alpha$ is the significance level of the
test ($\alpha = 0.05$). The most frequently used values of $\alpha$,
the probability of a type I error, are, as mentioned before, 0.05
and 0.01, and the corresponding values of $Z$ from the table are
1.96 and 2.58 respectively.


4.5. WORKED EXAMPLES 

1. Suppose it is known from experience that the standard deviation
of the weight of 100 gm tooth paste made by a certain firm is 0.25
gm. To check whether its production is under control on a given day,
that is, to check whether the true average weight is 100 gm, a
random sample of 36 was selected and the mean weight $\bar{x}$ was
found to be 100.217 gm. Since the firm stands to lose money when
$\mu > 100$ and the customer loses when $\mu < 100$, test the null
hypothesis $\mu = 100$ against the alternative $\mu \neq 100$ using
$\alpha = 0.01$.

Here $H_0: \mu = 100$

$H_1: \mu \neq 100$

The critical region is $|Z| > Z_{0.01} = 2.58$ (two-sided critical
value).

$$Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$

Substituting the values for $\bar{x}$, $\mu_0$, $\sigma$ and $n$,

$$Z = \frac{100.217 - 100}{0.25/\sqrt{36}} = \frac{0.217}{0.04167} \approx 5.21$$

Since the calculated $Z > Z_{0.01}$, reject the null hypothesis; it
is advisable to improve the production process.
2. The security department of a factory wants to know whether the
true average time required by the night watchman to walk his round
is 30 minutes. If, in a random sample of 32 rounds, the night
watchman averaged 30.8 minutes with a standard deviation of 1.5
minutes, determine at $\alpha = 0.01$ whether this is sufficient
evidence to reject the null hypothesis $\mu = 30$ minutes in favour
of the alternative hypothesis $\mu \neq 30$ minutes.

Solution: Null hypothesis $H_0: \mu = 30$
Alternative hypothesis $H_1: \mu \neq 30$
The critical region is $|Z| > Z_{0.01} = 2.58$

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{30.8 - 30}{1.5/\sqrt{32}} = \frac{0.8}{0.2652} = 3.02$$

Since $Z > Z_{0.01}$, reject the null hypothesis.

3. A sample of 400 students is found to have a mean weight of 43.25
kg. Can this be regarded as a sample from a large population with a
mean weight of 42.71 kg and a standard deviation of 6.28 kg?

The null hypothesis $H_0: \mu = 42.71$ kg.
Alternative hypothesis $H_1: \mu \neq 42.71$ kg.
Consider the critical region $|Z| > Z_{0.05} = 1.96$

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{43.25 - 42.71}{6.28/\sqrt{400}} = 1.72$$

Since $Z = 1.72$, which is less than $Z_{0.05}$, we accept the null
hypothesis.

Decision: The apparent increase in the sample mean weight may be
attributed to chance.

4. Taking the standard deviation for pulse rate in adults as 8,
would you say a high pulse rate was diagnosed if, in a group of
fifty suffering from a certain disease, the average pulse rate was
75 as against a normal rate of 70?

$H_0: \mu = 70$

$H_1: \mu \neq 70$

Consider the critical region $|Z| > Z_{0.001} = 3.29$

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{75 - 70}{8/\sqrt{50}} = 4.42$$

Since $Z > Z_{0.001}$,
Decision: Reject the null hypothesis; a high pulse rate was
diagnosed at $\alpha = 0.001$.
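The arithmetic of worked examples 1 to 4 can be re-checked with a few lines of code. This is a sketch, not part of the original solutions: the function name `z_statistic` and the dictionary labels are assumptions, while the figures are those given in the examples. Small differences in the last digit from hand calculation arise from rounding the standard error before dividing.

```python
from math import sqrt


def z_statistic(sample_mean, mu0, sigma, n):
    """Z = (sample mean - hypothesised mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


# (sample mean, hypothesised mean, standard deviation, sample size)
worked_examples = {
    "1. tooth paste": (100.217, 100.0, 0.25, 36),
    "2. watchman": (30.8, 30.0, 1.5, 32),
    "3. student weights": (43.25, 42.71, 6.28, 400),
    "4. pulse rate": (75.0, 70.0, 8.0, 50),
}

for label, figures in worked_examples.items():
    print(f"{label:20s} Z = {z_statistic(*figures):.2f}")
```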

5. A population is known to be normal and to have a standard
deviation of 0.104 seconds. A random sample of 12 items has a mean
of 12.33 seconds. Calculate the 95% confidence limits for the
population mean.

We know that

$$\bar{X} - 1.96\,\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\,\frac{\sigma}{\sqrt{n}}$$

since

$$P\left(|\bar{X} - \mu| < 1.96\,\frac{\sigma}{\sqrt{n}}\right) = 0.95$$

Hence the confidence limits for the population mean are

$$12.33 - (1.96)\frac{0.104}{\sqrt{12}} < \mu < 12.33 + (1.96)\frac{0.104}{\sqrt{12}}$$

i.e. $12.33 - 0.0588 < \mu < 12.33 + 0.0588$

$12.2712 < \mu < 12.3888$

The 95% confidence limits for the population mean are 12.27 and
12.39; in repeated sampling, only about 5 out of one hundred such
intervals would fail to include the population mean.
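The confidence limits of example 5 can be reproduced as follows. The sketch assumes a known population standard deviation, as in the example, and obtains the multiplier 1.96 from the standard normal distribution instead of a table; the function name `confidence_limits` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def confidence_limits(sample_mean, sigma, n, confidence=0.95):
    """Limits for the population mean when sigma is known."""
    # Two-sided multiplier, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Figures from worked example 5.
lower, upper = confidence_limits(sample_mean=12.33, sigma=0.104, n=12)
print(f"95% confidence limits: {lower:.4f} to {upper:.4f}")
```

Passing `confidence=0.99` gives the wider limits that correspond to the multiplier 2.58.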


4.6. TEST FOR EQUALITY OF TWO MEANS 

One of the important tests of significance in applied research is
the one which deals with the problem of whether the observed
difference between two sample means can be attributed to chance or
whether it is evidence that the samples came from populations with
different means.

Consider the independent random variables $X_1$ and $X_2$ which we
wish to compare with respect to their means, the comparison being
made on a random sample of size $n_1$ from $X_1$ and a random sample
of size $n_2$ from $X_2$. We are dealing with independent random
samples of sizes $n_1$ and $n_2$ from two normal populations having
means $\mu_1$ and $\mu_2$ and known variances $\sigma_1^2$ and
$\sigma_2^2$. The null hypothesis considered in this situation is
usually $\mu_1 = \mu_2$, where $\mu_1 = E(X_1)$ and
$\mu_2 = E(X_2)$, and the alternative hypothesis naturally is
$\mu_1 \neq \mu_2$.
Test Procedure: Let $\bar{x}_1$ and $\bar{x}_2$ be the means of
samples of sizes $n_1$ and $n_2$ from $X_1$ and $X_2$ respectively.
Since these two samples are from normal populations,
$(\bar{x}_1 - \bar{x}_2)$ has a normal distribution with mean
$(\mu_1 - \mu_2)$ and variance
$\left(\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}\right)$, since

$$E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$$

and

$$V(\bar{x}_1 - \bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

If it is desired to test the null hypothesis
$\mu_1 - \mu_2 = \delta$, where $\delta$ is a specified constant,
against the alternative hypothesis $\mu_1 - \mu_2 \neq \delta$, the
test is based on the difference between the sample means
$(\bar{x}_1 - \bar{x}_2)$, and thus

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - \delta}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

The critical value $Z_\alpha$ is considered ($\alpha$ = 0.05 or 0.01
or 0.001); if $|Z| > Z_\alpha$, the null hypothesis is rejected; on
the other hand, if $|Z| < Z_\alpha$, the null hypothesis is
accepted.
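The test procedure above can be sketched as two small functions. This is an illustration under the section's own assumptions (independent samples, known or large-sample standard deviations, two-sided alternative); the function names and the figures in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, delta=0.0):
    """Z statistic for H0: mu1 - mu2 = delta."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (mean1 - mean2 - delta) / standard_error


def two_sided_decision(z, alpha):
    """Compare |Z| with the two-sided critical value for alpha."""
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return "reject H0" if abs(z) > critical else "accept H0"


# Hypothetical figures, chosen only to exercise the functions.
z = two_sample_z(mean1=52.0, mean2=50.0, sigma1=4.0, sigma2=3.0, n1=64, n2=36)
print(f"Z = {z:.3f}")
print("alpha = 0.05:", two_sided_decision(z, 0.05))
print("alpha = 0.01:", two_sided_decision(z, 0.01))
```

Keeping the statistic and the decision rule in separate functions mirrors the procedure: the value of $Z$ depends only on the data, while the decision also depends on the chosen $\alpha$.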


4.7 ILLUSTRATIVE EXERCISES 

1. A sample survey conducted in a large extension in Bangalore in
1970 and again in 1980 showed that in 1970 the average height of 400
ten year old boys was 53.2 inches with a standard deviation of 2.4
inches, while in 1980 the average height of 500 ten year old boys
was 54.5 inches with a standard deviation of 2.5 inches. Test the
null hypothesis $\mu_1 - \mu_2 = -0.5$ against the alternative
hypothesis $\mu_1 - \mu_2 < -0.5$ at the level of significance
$\alpha = 0.05$.

Null hypothesis $H_0: \mu_1 - \mu_2 = -0.5$
Alternative hypothesis $H_1: \mu_1 - \mu_2 < -0.5$
Critical region for $\alpha = 0.05$ (the alternative being
one-sided): $Z < -1.645$, where

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - \delta}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

Here $\bar{x}_1 = 53.2$ inches, $\sigma_1 = 2.4$ inches, $n_1 = 400$
$\bar{x}_2 = 54.5$ inches, $\sigma_2 = 2.5$ inches, $n_2 = 500$
and $\delta = -0.5$

$$Z = \frac{53.2 - 54.5 - (-0.5)}{\sqrt{\dfrac{(2.4)^2}{400} + \dfrac{(2.5)^2}{500}}} = \frac{-0.8}{0.164} = -4.878$$

Since $Z = -4.878 < -1.645$, the null hypothesis is rejected.

2. A sample study was made of the number of business lunches that
executives claim as deductible expenses per month. If 40 executives
in the insurance industry averaged 9.1 such deductions with a
standard deviation of 1.9 in a given month, while 50 bank executives
averaged 8.0 with a standard deviation of 2.1, test the null
hypothesis $\mu_1 - \mu_2 = 0$ against the alternative hypothesis
$\mu_1 - \mu_2 \neq 0$ at $\alpha = 0.05$.

The null hypothesis $H_0: \mu_1 - \mu_2 = 0$ or $\mu_1 = \mu_2$
Alternative hypothesis $H_1: \mu_1 - \mu_2 \neq 0$ or
$\mu_1 \neq \mu_2$
The critical region is $|Z| > Z_{0.05} = 1.96$, where

$$Z = \frac{(\bar{x}_1 - \bar{x}_2) - \delta}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

In this problem $\bar{x}_1 = 9.1$, $\sigma_1 = 1.9$, $n_1 = 40$,
$\bar{x}_2 = 8.0$, $\sigma_2 = 2.1$, $n_2 = 50$ and $\delta = 0$

$$Z = \frac{9.1 - 8.0}{\sqrt{\dfrac{(1.9)^2}{40} + \dfrac{(2.1)^2}{50}}} = \frac{1.1}{0.4224} = 2.6042$$

Since $|Z| > 1.96$, the null hypothesis is rejected.

3. Samples of students were drawn from two universities and, from
their weights (in kilograms), the means and standard deviations were
calculated. Make a large sample test to test the significance of the
difference between the means.

                    Mean      S.D.      Size of Sample
University A:       55        10        400
University B:       57        15        100
                                  [Delhi Uni. B.Sc.(Sub) 1967]

Null hypothesis $H_0: \mu_1 = \mu_2$; the sample mean weights do not
differ significantly.
Alternative hypothesis $H_1: \mu_1 \neq \mu_2$, i.e.
$\mu_1 > \mu_2$ or $\mu_2 > \mu_1$.
Under the null hypothesis, the critical region is
$|Z| > Z_{0.01} = 2.58$, where

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{55 - 57}{\sqrt{\dfrac{10^2}{400} + \dfrac{15^2}{100}}} = -1.2649$$

$|Z| = 1.2649 < Z_{0.01}$

Conclusion: Accept the null hypothesis; the difference between the
sample means is not significant. The difference of 2 kg between the
universities is not large enough to reject the null hypothesis.

4. In a medical examination of college male students, it was found
that the average chest girth of 2469 male students belonging to
college A was 28.3 inches with a standard deviation of 1.84 inches.
For another 2142 male students of another college B, the average
girth was 29.8 inches with a standard deviation of 2.11 inches. Is
there a significant difference in chest girth between the two
colleges?

Null hypothesis $H_0: \mu_1 = \mu_2$; there is no difference between
the mean chest girths of colleges A and B.

Alternative hypothesis $H_1: \mu_1 \neq \mu_2$

Given
$n_1 = 2469$, $\bar{x}_1 = 28.3$ inches, $\sigma_1 = 1.84$ inches
$n_2 = 2142$, $\bar{x}_2 = 29.8$ inches, $\sigma_2 = 2.11$ inches

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{28.3 - 29.8}{\sqrt{\dfrac{(1.84)^2}{2469} + \dfrac{(2.11)^2}{2142}}} = -25.54$$

$|Z| = 25.54 > Z_{0.01} = 2.58$

Decision: Reject the null hypothesis at the level of significance
$\alpha = 0.01$. Hence the observed difference between the sample
means (1.5 inches) is highly significant indeed.


5. The mean yield of two sets of plots and their variability are
given as follows:

                          Set of 40 plots     Set of 60 plots
Mean yield per plot       1258 lb             1243 lb
S.D. per plot             34 lb               28 lb

Examine whether the difference in the mean yields of the two sets of
plots is significant. (IAS 1965)

Null hypothesis $H_0: \mu_1 = \mu_2$. There is no difference in the
mean yields of the two sets of plots.
Alternative hypothesis $H_1: \mu_1 \neq \mu_2$

Let us consider the level of significance $\alpha = 0.01$, for which
the critical region is $|Z| > Z_{0.01} = 2.58$.

$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{1258 - 1243}{\sqrt{\dfrac{34^2}{40} + \dfrac{28^2}{60}}} = 2.3155$$

Thus $|Z| < Z_{0.01}$ and the null hypothesis is accepted: there is
no significant difference in the mean yields of the two sets of
plots.

Supposing we take the critical region $|Z| > Z_{0.05} = 1.96$, then,
referring to the calculations, $Z = 2.3155$, which is more than 1.96
($Z_{0.05}$); hence the null hypothesis is rejected.

Thus the null hypothesis is rejected at $\alpha = 0.05$, whereas at
$\alpha = 0.01$ the null hypothesis is accepted.
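Example 5 reaches different decisions at the two levels of significance. One way to see why is to compute the two-sided tail probability of the observed $Z$ (often called the p-value, a term not used in the text): it lies between 0.01 and 0.05. The sketch below uses only the figures of example 5; the variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Figures from illustrative exercise 5 (yields in lb).
mean1, sd1, n1 = 1258, 34, 40
mean2, sd2, n2 = 1243, 28, 60

standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
z = (mean1 - mean2) / standard_error

# Probability, under H0, of a |Z| at least as large as the one observed.
tail_probability = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"Z = {z:.4f}")
print(f"two-sided tail probability = {tail_probability:.4f}")
for alpha in (0.05, 0.01):
    decision = "reject H0" if tail_probability < alpha else "accept H0"
    print(f"alpha = {alpha}: {decision}")
```

Comparing the tail probability with $\alpha$ is equivalent to comparing $|Z|$ with the critical value $Z_\alpha$, so the two decisions agree with those in the worked solution.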


4.8 EXERCISES 


1. Suppose the manufacturer of a new medication wants to test the
   null hypothesis $p = 0.90$ against the alternative hypothesis
   $p = 0.60$. His test statistic is $X$, the observed number of
   successes in $n = 20$ trials, and he will accept the null
   hypothesis if $x \geq 15$; otherwise he will conclude that
   $p = 0.60$. Evaluate the probabilities $\alpha$ and $\beta$ (of
   type I and type II errors).

2. Suppose that 100 tyres of a certain brand lasted on the average
   21,431 miles with a standard deviation of 1295 miles. Using
   $\alpha = 0.05$, test the null hypothesis $\mu = 22,000$ miles
   against the alternative hypothesis $\mu < 22,000$.

3. Suppose that the specifications for a certain kind of ribbon call
   for a mean breaking strength of 85 kg and that 45 pieces randomly
   selected from different rolls have a mean breaking strength of
   83.1 kg with a standard deviation of 3.8 kg. Assuming that the
   data are random samples from a normal population, test the null
   hypothesis $\mu = 85$ kg against the alternative hypothesis
   $\mu < 85$ kg at $\alpha = 0.01$.

4. The mean age at entry to a medical college was known to be 18.5
   years. The Registrar noted in one year that the mean age of the
   63 students entering was 18 years 4 months, with a standard
   deviation of 9 months. Was this difference significant?

5. The mean plasma volume at term in 32 women in their first
   pregnancy was 3.7 l, standard deviation 0.62 l. In their second
   pregnancy, the mean plasma volume at term was 4.2 l, standard
   deviation 0.6 l. Is this difference statistically significant?

6. According to the norms established for a reading comprehension
   test, 8th standard students should average 84.3 with a standard
   deviation of 8.6. If 45 randomly selected eighth standard
   students from a certain school averaged 87.8, test the null
   hypothesis $\mu = 84.3$ against the alternative hypothesis
   $\mu > 84.3$ using $\alpha = 0.01$.

7. The mean plasma cortisol concentration of healthy stressed
   subjects at 9 A.M. is 450 nmol/l with a range of 150-700 nmol/l.
   The plasma cortisol concentration under similar conditions in a
   group of 33 patients with Cushing's syndrome was 1150 nmol/l with
   a standard deviation of 30 nmol/l. Is this difference
   statistically significant?

8. Suppose that the nicotine contents of 2 brands of cigarettes are
   being measured. If in an experiment fifty cigarettes of the first
   brand had an average nicotine content of 2.61 milligrams with a
   standard deviation of 0.12 mg, while forty cigarettes of the
   second brand had an average nicotine content of 2.38 mg with a
   standard deviation of 0.14 mg, test the null hypothesis
   $\mu_1 = \mu_2$ against the alternative hypothesis
   $\mu_1 \neq \mu_2$ using $\alpha = 0.05$.

9. In the comparison of two kinds of paints, a consumer testing
   service finds that 44 one gallon cans of one brand cover on an
   average 512 square feet with a S.D. of 31 square feet, while 34
   one gallon cans of another brand cover on the average 492 sq. ft.
   with a standard deviation of 26 sq. ft. Test the null hypothesis
   $\mu_1 - \mu_2 = 0$ against the alternative hypothesis
   $\mu_1 - \mu_2 \neq 0$ at the level of significance
   $\alpha = 0.05$.

10. The mean diastolic blood pressure in 86 general practitioners
    whose body weights were above the 95th centile of weight for age
    was 88.6 mmHg, standard deviation 15 mmHg. The comparable
    figures for 90 general practitioners of a similar age range
    whose weights were at or below the 50th centile were 82.0 mmHg
    and standard deviation 13 mmHg. Did the mean diastolic blood
    pressure differ in the two groups?

11. The number of accidents per day was studied for 144 days in city
    A and for 120 days in city B. The mean number of accidents and
    standard deviations were 5.3 and 1.3 for city A and 4.7 and 1.2
    for city B. Is city A more prone to accidents than city B?

12. A group of 120 students of a college take an entrance test and
    obtain a mean score of 55 with a standard deviation of 11.
    Another group of 90 students from another college take the same
    test and obtain a mean score of 49 with a standard deviation of
    13. Test the hypothesis that the two groups are random samples
    from the same population.

13. In examining the average lifetime of a TV tube, 51 T.V. sets of
    brand C had an average lifetime of 497 hours with a standard
    deviation of 110 hours, whereas for 47 T.V. sets of brand D the
    average lifetime was 502 hours with a S.D. of 180 hours. Is this
    difference significant?

14. ... survival explicable as a random fluctuation?


5. Vital Statistics 


5.1 INTRODUCTION
The population of a country changes every moment: recurring births
and immigration add to the existing figure, while emigration and
deaths reduce it. The system of counting births, marriages,
migration, diseases and disabilities, and deaths is known as vital
statistics. In other words, the counting of vital events (important
events in life like births, marriages, divorces, separation,
diseases and deaths), which is evidently a continuous process, is
known as Vital Statistics. Since counting the entire population,
known as a census, involves heavy expenditure, is time consuming and
requires heavy manpower, quick estimates of population obtained by
the collection of vital events can be an alternative and convenient
method. The census forms a record of persons whereas the other is a
record of events. The numerical portrayal of human population, which
is termed Demography, is a study of the aggregate and not of
individuals.

The system of collecting and recording births and deaths dates back
to 1250 B.C., in the reign of King Ramses II of Egypt, wherein a
comprehensive birth and death registration was introduced. John
Graunt of England in 1662 published a book, 'Natural and Political
Observations Made upon the Bills of Mortality'. He analysed vital
records by collection and tabulation and could report on the cause
of death, age and seasonal variation as well. He was neither a
doctor nor a statistician, and his curiosity about vital events
contributed to a scientific method of predicting and prolonging the
lifetime of human beings by the use of vital records. Dr. William
Farr of England developed the analysis of mortality statistics
during 1839. The registration of births and deaths is compulsory
worldwide; the information is collected by the method of house to
house enumeration and processed, which is obviously a gigantic task,
and is presented in published documents.


5.2. SOURCES AND USE OF VITAL STATISTICS 
As pointed out earlier, vital statistics is the numerical
information obtained by counting and classifying information from
the vital records. These records refer to all the individuals in the
community. Hence the source of vital statistics is the individual
and his/her family: as and when an event such as a birth, marriage,
disease or death occurs in the family, it is registered and
recorded, and the pooling of all such events for the entire
community forms the system of vital statistics, which throws light
on the magnitude of the situation with regard to the total number of
births and deaths occurring. This helps the government in planning
and management and also helps it to take precautionary steps for
development works on the basis of the trend of the growth of
population. The importance of registration of births and deaths for
individuals is manifold. The birth certificate is a legal document
of having been born and is essential for entering school, taking
insurance, joining service, qualifying for voting, obtaining
inheritance benefits and so on. Similarly the death certificate is
needed for getting the insurance benefits of the deceased, family
pension, settling properties and so on. On the same lines, the
marriage certificate is essential for an individual since it proves
the legality of the marriage and the legitimacy of offspring, and is
needed for divorce also. Having considered the importance of vital
statistics for individuals, let us consider how the government can
utilise them for welfare purposes. The government has to look after
the health of the whole of the inhabitants, and counting is
imperative since the entire community is to be looked after. The
health examination depends on the recording of all the vital events
occurring in each and every household. The vital records enable the
ill-health of the community to be diagnosed. The other uses of vital
statistics are that the socio-economic conditions, the proportion of
dependents, family welfare programmes, age and sex distribution,
insurance and the entire planning for future developments can be
envisaged using such records. Further, private enterprises plan the
manufacture of articles like baby food, umbrellas, jewels, housing
etc. on the basis of the prevailing trend of births and deaths.

The United Nations has developed by consensus 'Principles for a
Vital Statistics System', and the recommendations are implemented
all over the world, with improvements in advanced countries.

5.3 BIRTH AND DEATH RATES 
The numerical information about the number of people or events found
at a certain date or period in a country is an absolute number. For
most purposes it is enough to know how many births or deaths
occurred in a year, but sometimes it is desired to measure these
facts in relation to some other number, say the total population,
and a relative number is obtained. This helps for the purpose of
comparison. One such relative number used is a 'rate', which is the
figure obtained by division, wherein the numerator gives the count
of the number of times an event has occurred and the denominator
gives the total number of individuals exposed to the risk of that
event.

Thus

$$\text{rate} = \frac{x}{x + y}$$

the numerator being a part of the denominator.

Birth rate: The birth rate or the crude birth rate is the ratio of
the number of live births occurring in a geographical area during a
specified year to the mid-year estimated population of the same
geographical area during the same year, expressed per 1000.

Hence

$$\text{Crude birth rate} = \frac{\text{Total number of births for a year in a country}}{\text{Mid-year population for the year in the same country}} \times 1000$$

Since the numerator is considerably less than the denominator, to
avoid fractions the rate is expressed per 1000 as a convention.

$$\text{C.B.R.} = \frac{B}{P} \times 1000$$

where $B$ = total number of births registered during the calendar
year (January 1 to December 31) and $P$ is the population at the
middle of the year (July 1). Two questions arise in understanding
the birth rate. First, why is it called crude, and what about an
improved form? Second, how is the mid-year or average population
estimated?

The rate is called crude because the number of births occurring in a
population depends on the proportion of females in the reproductive
age group, whereas the denominator in the crude birth rate comprises
the entire population. Alternatively, if the live births are related
to the female population in the reproductive age group, an improved
form is obtained which is known as the fertility rate. Fertility is
the actual performance of a population as far as the number of births
is concerned. Hence fertility is measured as the frequency of births
occurring in a population.

The following are some of the fertility rates in common vogue. 
(i) General fertility rate: This is defined as the ratio of the total
number of live births to the female population in the child bearing or
reproductive age group of 15-49 years, expressed per 1000.

$$\text{G.F.R.} = \frac{\text{Total live births occurring in a geographical area during a year}}{\text{Female population in the geographical area during the year in the age group of 15 to 49 yrs}} \times 1000$$

Here the numerator refers to both legitimate and illegitimate births
and the denominator does not take into consideration the marital
status of women. The broad age group of 15 to 49 yrs can be split
further, for planning purposes, to find out the most vulnerable group
as far as fertility is concerned; this leads to the age specific
fertility rate.


(ii) Age specific fertility rate: This is the ratio of the number of
births to females of a given age interval to the number of females in
that age interval, usually taken in 5 year intervals.

$$\text{Age S.F.R. (for say 15-19 yrs)} = \frac{\text{Total live births in a year for females in the age group 15 to 19 years}}{\text{Female population in the age group of 15 to 19 years in a year}} \times 1000$$


Similarly, fertility rates for other age groups like 20-24 yrs, 25-29
yrs and so on can be obtained. These age specific fertility rates
identify the fertility frequency for a calendar year as well as the
fertility performance of each age group. This is an extremely useful
indicator of the reproductive career and is valuable for the family
planning programme also.

(iii) Total fertility rate: It is the sum of the age specific
fertility rates at each age from 15 to 49 years. This gives the
average number of babies born to a female passing through the
reproductive age group, which is useful in motivating people towards
small family norms.

The second aspect in the computation of the birth rate concerns the
mid-year estimated population or average population. While computing
the birth rate the numerator takes into account the total number of
births occurring throughout the year, whereas the population to be
considered for the purpose of a relative measure refers to a
particular point of time. Hence the average population for a
particular year, known as the mid-year estimated population, is
considered (1st July of a year). Since the population is not counted
every year but only once in 10 years, estimates of the population are
made from the preceding census figures. The following are some of the
methods of estimating the population for the inter-censal and
post-censal years.


(a) Natural increase method
Population in year ‘1’ = Population in year ‘0’ + (Births − Deaths
+ Immigration − Emigration) in year ‘0’
This is a very satisfactory method if the registration of births,
deaths and migration is complete and comprehensive; under-registration
will distort the estimate to a considerable extent. It is not
possible to project the future trend of the population using this
method.


(b) Arithmetic progression method
It is assumed that the population increases in arithmetic progression
(A.P.) year by year, the annual increase being obtained by comparing
the figures of any two census years.


Illustration: Suppose it is desired to estimate the population of
India as on 1.7.1969.

Consider the census populations of India for 1961 and 1971. Then the
population as on 1.7.1969

= pop. as per census 1.3.61 + (increase in pop. between
1.3.61 and 1.7.69)

1961 census population : 439 millions
1971 census population : 547 millions
increase in 10 years   : 108 millions
increase in 1 year     : 10.80 millions
increase in 1 month    : 0.90 millions
increase in 1 day      : 0.03 millions
Hence the mid-year estimated population of India (as on 1.7.69)

= 439 millions + (increase between 1.3.61 and 1.7.69, i.e. for 8
years and 4 months)

= 439 millions + 8 × 10.8 millions + 4 × 0.90 millions

= 529 millions.
In this method we assume the increase per year between two census
populations to be constant, which may not be so. Further, on the
basis of this assumption the population in 1981 should have been
547 + 108 = 655 millions, but the census figure for 1981, which
excludes the enumeration in Jammu & Kashmir and Assam and uses
estimates for these states, was 684 millions. Hence it is reasonable
to assume that the population is not increasing in A.P. but in
Geometric Progression (G.P.).
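
The arithmetic of the A.P. method can be checked with a few lines of Python. This is only a sketch of the calculation in the illustration; the function name `ap_estimate` and its parameters are illustrative and not part of the text.

```python
def ap_estimate(pop_start, pop_end, years_between, elapsed_years, elapsed_months=0):
    """Estimate a population assuming a constant absolute increase per year."""
    yearly = (pop_end - pop_start) / years_between   # increase in 1 year
    monthly = yearly / 12                            # increase in 1 month
    return pop_start + elapsed_years * yearly + elapsed_months * monthly


# Census figures in millions: 1.3.1961 and 1.3.1971.
mid_1969 = ap_estimate(439, 547, 10, elapsed_years=8, elapsed_months=4)
proj_1981 = ap_estimate(439, 547, 10, elapsed_years=20)

print(round(mid_1969, 1))   # estimate for 1.7.1969, in millions
print(round(proj_1981, 1))  # A.P. projection for 1981, in millions
```

The first figure reproduces the 529 millions of the illustration and the second the 655 millions that the A.P. assumption implies for 1981.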


(c) G.P. method
Instead of assuming the year by year increase between two census
populations to be constant, the rate of growth of population is
assumed to be constant. Such a procedure is known as the G.P. method
of estimating the population.

$$\text{Population in year `1'} = \text{population in year `0'} \times \left(1 + \frac{r}{100}\right)$$

where ‘r’ is the rate of growth of population (per cent per year).
In general

$$P_n = P_0 \left(1 + \frac{r}{100}\right)^n$$

where P_n = population at year ‘n’
      P_0 = initial population (population in year ‘0’)
      r   = rate of growth of population
      n   = number of years
We can estimate the population of India for the next census, 1991,
using the 1981 census figure with an estimated rate of growth of
2.17%.

$$\text{Population in 1991} = \text{population in 1981} \times \left(1 + \frac{r}{100}\right)^n = 684 \left(1 + \frac{2.17}{100}\right)^{10} = 684\,(1.0217)^{10} \approx 847.8 \text{ millions}$$
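
The G.P. formula is a one-line compound growth calculation. The sketch below, with the illustrative name `gp_estimate`, evaluates it for the 1981 census figure and the assumed growth rate of 2.17%. Evaluated in floating point, the result comes out near 848 millions; a hand computation with logarithm tables may differ slightly in the last digits.

```python
def gp_estimate(pop_start, rate_percent, years):
    """Estimate a population assuming a constant percentage growth rate per year."""
    return pop_start * (1 + rate_percent / 100) ** years


# 1981 census population of India in millions, projected 10 years ahead.
pop_1991 = gp_estimate(684, 2.17, 10)
print(round(pop_1991, 1))
```

Fractional values of `years` are also accepted, so the same function covers mid-year estimates such as a period of 3.5 years.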
Apart from these three methods of estimating the mid-year population,
more accurate estimates can be obtained using numerical analysis
techniques; these are not dealt with here since they are beyond the
scope of this book.


Mortality rates: The death rate or the crude death rate is the ratio
of the total number of deaths occurring in an area during a calendar
year to the mid-year estimated population of the same area for that
year, expressed per 1000.

$$\text{C.D.R.} = \frac{D}{P} \times 1000$$

where D = total number of deaths occurring in a geographical
          area during a calendar year (Jan. 1 to Dec. 31)

      P = mid-year estimated population (1 July) of the same
          geographical area during the year.


This rate is called crude since the denominator comprises the
entire population, whereas the risk of death varies from one age
group to another. Age specific, cause specific and sex specific
mortality rates are improved measures of the death rate. They are
needed when an elaborate study of the mortality conditions of a
community is required: for instance, maternal and child health (MCH)
work needs the mortality of infants and mothers, and insurance
companies need the mortality rates at each age.

The sex specific death rate (females) is the ratio of the number of
female deaths to the mid-year female population, expressed per 1000.
Similarly it can be computed for males also.


Age specific mortality rates 


Infant mortality rate: It is the ratio of the number of infant deaths
in an area during a year to the total live births occurring in the
area during the calendar year, expressed per 1000.

$$\text{I.M.R.} = \frac{D_0}{b} \times 1000$$

where D_0 is the number of infant deaths (0 to 1 year of age)
during the year and b is the number of live births in the same year.


The infant mortality rate is further subdivided into neonatal and
post-neonatal mortality rates. In the former, deaths of infants
between 0 and 28 days after birth are considered, whereas in the
latter deaths between 29 days and 1 year are taken in the numerator.

The infant mortality rate is considered the most sensitive index of
the health conditions of a community, since the baby is exposed to an
altogether new environment and its reactions are well reflected in
this rate. If the infant mortality rate is high, it means that the
immunisation programme is inadequate, the nutrition of the mother and
child is not satisfactory, environmental sanitation is poor and so
on. In an ideal situation, at least theoretically, this rate must be
almost equal to zero.


Maternal mortality rate: This is defined as the ratio of the number
of deaths of mothers due to puerperal causes in an area during a year
to the total number of live births occurring in the area during the
year, expressed per 1000.

$$\text{M.M.R.} = \frac{m}{b} \times 1000$$

where m = number of maternal deaths occurring in a calendar year
          (deaths due to pregnancy and delivery of a baby)
      b = total number of live births in the year

Here only deaths ascribed to pregnancy and child birth, including
deaths of mothers while delivering a baby, are considered in the
numerator. If a pregnant mother dies in a road accident, the cause of
death is different and the death is not taken in the numerator while
calculating the M.M.R.

The death rates can be calculated on the same lines for other age
groups as well (1-4 years, 5-14, 15-44, 45-64, 65+) and these are
termed mortality rates for pre-school, school going, early
adolescence, late adolescence, old age and the like.


5.4 STANDARDISED DEATH RATE 
The crude death rate does not take into cognisance the age and sex
composition of the population, and if it is desired to compare the
crude death rates of two places, conclusions drawn on the basis of
these rates may be erroneous. The standardised death rate, a
theoretical rate calculated for the purpose of comparison, is
computed wherein the death rates of all ages are used in such a way
as to give weightage to the age and sex composition of the
population.
This is illustrated from the following data:

Age group          Population               Number of deaths
               P        Q        R         P       Q       R

0-4          3500     3000     2800       110      55      57
5-14         1500     2000     1200        70      65      44
15-44        2000     2500     3000        25      34      96
45-64        1000     1000     2000        15     108      65
65+          2000     1500     1000       121      65      54
Total      10,000   10,000   10,000       341     327     316


The crude death rates for P, Q and R can be obtained:

$$\text{C.D.R. for P} = \frac{341}{10000} \times 1000 = 34.1$$

For Q and R the rates are 32.7 and 31.6 per thousand population
respectively, which indicates that place P has the highest death
rate. However, if the death rates are computed separately for each
age group, this conclusion appears to be fallacious, as illustrated
below.

                 Death rates per 1000
Age group        P         Q         R
0-4            31.43     18.33     20.36
5-14           46.67     32.50     36.67
15-44          12.50     13.60     32.00
45-64          15.00    108.00     32.50
65+            60.50     43.33     54.00

The calculation of death rates for each age group shows that the
crude death rate in P is highest because P has the largest proportion
of old people and children (0-4 yrs.) as compared to Q and R, and
naturally has a higher number of deaths. While comparing the
mortality rates of two or more places, the crude death rates do not
take into cognisance the age and sex composition of the population
which, as illustrated above, may be a contributing factor to a higher
or lower death rate. Hence, to eliminate the effect of variation in
age and sex characteristics, the standardisation technique is adopted
to find out the number of deaths that would have occurred in, say,
‘Q’ with the mortality experience of ‘P’ or vice versa.

By doing this it is assumed that the age distributions of the two
places are the same and the death rates are obtained as usual; this
is known as the ‘adjusted’, corrected or standardised death rate.
Consider the situations of places Q and R.

The numbers of deaths that would have occurred in R with the
mortality experience of ‘Q’, and in Q with the mortality experience
of ‘R’, are:

Age group   No. of deaths in ‘R’ with         No. of deaths in ‘Q’ with
            mortality experience of ‘Q’       mortality experience of ‘R’

0-4         (18.33/1000) × 2800 =  51         (20.36/1000) × 3000 = 61
5-14        (32.50/1000) × 1200 =  39         (36.67/1000) × 2000 = 73
15-44       (13.60/1000) × 3000 =  41         (32.00/1000) × 2500 = 80
45-64       (108.00/1000) × 2000 = 216        (32.50/1000) × 1000 = 33
65+         (43.33/1000) × 1000 =  43         (54.00/1000) × 1500 = 81
Total                              390                               328
Thus area ‘Q’ shows an S.D.R. (with ‘R’ as standard) of 32.8 per
1000, whereas for area ‘R’ it is 39.0. The crude death rates showed
the reverse order (32.7 for Q against 31.6 for R); the difference is
the effect of taking the age distribution of one place as the
standard for the other.

This is known as the direct method of standardisation. Alternatively,
the specific death rates of a standard population may be applied to
the respective age groups of the populations of the areas to be
compared; the deaths thus obtained in each age group are added and
divided by the total population of the area, which gives the index
death rate. The standardised death rate is then obtained by
multiplying the crude death rate of the area by an adjustment factor
based on the index death rate. This is known as the indirect method
of standardisation.
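
The direct method illustrated with places Q and R can be sketched in Python as follows. The function names are illustrative. One assumption is made about the data: the number of deaths for Q in the 65+ group is taken as 65, the value consistent with the total of 327 deaths and the rate of 43.33 per 1000 quoted in the illustration.

```python
# Populations and deaths by age group (0-4, 5-14, 15-44, 45-64, 65+)
# for places Q and R, taken from the illustration.
POP_Q = [3000, 2000, 2500, 1000, 1500]
POP_R = [2800, 1200, 3000, 2000, 1000]
DEATHS_Q = [55, 65, 34, 108, 65]   # 65+ taken as 65 (assumption, see above)
DEATHS_R = [57, 44, 96, 65, 54]


def crude_rate(deaths, pop):
    """Crude death rate per 1000 population."""
    return sum(deaths) / sum(pop) * 1000


def standardised_rate(deaths, pop, standard_pop):
    """Apply one place's age specific death rates to another age distribution."""
    expected = sum(d / p * s for d, p, s in zip(deaths, pop, standard_pop))
    return expected / sum(standard_pop) * 1000


cdr_q = crude_rate(DEATHS_Q, POP_Q)
cdr_r = crude_rate(DEATHS_R, POP_R)
q_rates_on_r = standardised_rate(DEATHS_Q, POP_Q, POP_R)  # deaths in R with Q's experience
r_rates_on_q = standardised_rate(DEATHS_R, POP_R, POP_Q)  # deaths in Q with R's experience

print(round(cdr_q, 1), round(cdr_r, 1))
print(round(q_rates_on_r, 1), round(r_rates_on_q, 1))
```

Because the code works with unrounded age specific rates, the expected deaths are not rounded to whole numbers in each age group, so the results agree with the hand-computed totals of 390 and 328 only to the first decimal place of the rate.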


5.5. SOLVED PROBLEMS 
1. The population of a town in Karnataka on 1-7-1983 was 40,000. 
The following vital events for the same year were: 

Number of births: 1320, number of deaths: 510, number of infant 
deaths: 120. Compute the birth, death and infant mortality rates. 
Also compute the growth rate. 


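$$\text{Birth rate} = \frac{\text{No. of births}}{\text{Mid-year population}} \times 1000 = \frac{1320}{40000} \times 1000 = 33 \text{ per } 1000$$

$$\text{Death rate} = \frac{\text{No. of deaths}}{\text{Mid-year population}} \times 1000 = \frac{510}{40000} \times 1000 = 12.75 \text{ per } 1000$$

$$\text{Infant mortality rate} = \frac{\text{No. of infant deaths}}{\text{Total live births}} \times 1000 = \frac{120}{1320} \times 1000 = 90.91$$

$$\text{Growth rate} = \frac{\text{B.R.} - \text{D.R.}}{10} = \frac{33 - 12.75}{10} = \frac{20.25}{10} = 2.025\%$$

The rate of growth of population in the town for 1983 was 2.025%
(this is expressed per 100 instead of per 1000).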
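
The rates in this problem can be verified with a short Python sketch. The helper name `rate_per_1000` is illustrative; it simply divides the number of events by the number exposed to the risk and multiplies by 1000.

```python
def rate_per_1000(events, exposed):
    """Number of events per 1000 of the population exposed to the risk."""
    return events / exposed * 1000


population = 40_000            # mid-year population on 1-7-1983
births, deaths, infant_deaths = 1320, 510, 120

birth_rate = rate_per_1000(births, population)
death_rate = rate_per_1000(deaths, population)
infant_mortality_rate = rate_per_1000(infant_deaths, births)  # per 1000 live births
growth_rate_percent = (birth_rate - death_rate) / 10          # per 100 instead of per 1000

print(round(birth_rate, 2), round(death_rate, 2))
print(round(infant_mortality_rate, 2), round(growth_rate_percent, 3))
```

Note that the infant mortality rate uses live births, not the mid-year population, as its denominator.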
2. The population of Karnataka as per the 1981 census was 37.043
millions, with birth rate and death rate of 34.3 and 12.7
respectively. What were the total numbers of births and deaths for
1981?

$$\text{B.R.} = \frac{B}{P} \times 1000$$

$$34.3 = \frac{B}{37043000} \times 1000$$

Hence number of births (B) = 34.3 × 37043 = 12,70,575
Similarly

$$\text{D.R.} = \frac{D}{P} \times 1000$$

$$12.7 = \frac{D}{37043000} \times 1000$$

Number of deaths (D) = 12.7 × 37043 = 4,70,446

3. From the following data, which gives the census population,
births, deaths and migration of a city in India, estimate the
mid-year population for the year 1983.

Census population of city ‘A’ in 1981: 1,32,000

Vital statistics for city ‘A’ during 1981-84.

Year    Births    Deaths    Immigration    Emigration
1981     4105      1312        1500           1605
1982     5706      1285        2585           2460
1983     6111      1278        4105           4220

For estimating the mid-year population of 1983, the net increase in
each year is to be calculated.

Year    Births − Deaths    Immigration − Emigration    Annual increase
1981         2793                  −105                     2688
1982         4421                  +125                     4546
1983         4833                  −115                     4718

Since the census of 1981 refers to the population as on 1st March,
the increase for 1981 is taken as 3/4 (2688).

Census population (1981)              1,32,000
increase in 1981 (3/4 [2688])            2,016
increase in 1982                         4,546
increase in 1983 (1/2 [4718])            2,359

Population as on 1.7.1983             1,40,921

Since the population is needed for the mid-year, half of the net
increase for 1983 is added to the increases for 1982 and 1981, which
when added to the census population gives the estimate as 1,40,921
by the natural increase method.
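
The natural increase calculation of this problem can be written out as a short Python sketch. The variable names are illustrative, and the fractions 3/4, 1 and 1/2 applied to the three years are those used in the worked solution, not a general rule.

```python
census_1981 = 132_000

# year: (births, deaths, immigration, emigration)
vital_events = {
    1981: (4105, 1312, 1500, 1605),
    1982: (5706, 1285, 2585, 2460),
    1983: (6111, 1278, 4105, 4220),
}

# Share of each year's net increase counted up to 1.7.1983,
# as taken in the worked solution.
share_of_year = {1981: 3 / 4, 1982: 1, 1983: 1 / 2}


def net_increase(births, deaths, immigration, emigration):
    """Natural increase plus net migration for one year."""
    return (births - deaths) + (immigration - emigration)


annual = {year: net_increase(*events) for year, events in vital_events.items()}
mid_1983 = census_1981 + sum(share_of_year[y] * annual[y] for y in annual)

print(annual)
print(mid_1983)
```

The dictionary `annual` reproduces the column of annual increases (2688, 4546, 4718) and `mid_1983` the final estimate.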

4. The population of Australia as on 30.6.1981 was 1,49,26,800
with a rate of growth of population of 0.8%. Estimate the population
as on 31.12.1984.

This can be done by the G.P. method.

$$\text{Population of Australia (as on 31.12.84)} = \text{Pop. as on 30.6.81} \times \left(1 + \frac{0.8}{100}\right)^{3.5}$$
$$= 14926800\,(1.008)^{3.5}$$
$$\approx 1,53,49,000$$


5. The following table gives the female population and number
of births for various age groups for 1974. Compute the general
fertility rate, specific fertility rates and the total fertility rate.

Age group:            15-19    20-24    25-29    30-34   35-39   40-44   45-49
Female population
(in '000):             1593     1316     1172     1031     876     733     601
Number of births:    101952   226352   185176   131968   52560   20524    2885

Solution

Age group   Female population   Number of   Age specific fertility rate
               (in '000)          births

15-19            1593             101952    (101952/1593000) × 1000 = 64
20-24            1316             226352    (226352/1316000) × 1000 = 172
25-29            1172             185176    (185176/1172000) × 1000 = 158
30-34            1031             131968    (131968/1031000) × 1000 = 128
35-39             876              52560    (52560/876000) × 1000  = 60
40-44             733              20524    (20524/733000) × 1000  = 28
45-49             601               2885    (2885/601000) × 1000   = 4.8

Total            7322             721417                             614.8

$$\text{G.F.R.} = \frac{\text{number of births}}{\text{female population in the age group of 15-49 yrs}} \times 1000 = \frac{721417}{7322000} \times 1000 = 98.53$$

Total fertility rate = sum of specific fertility rates × 5 (the
width of each age interval in years)
                     = 614.8 × 5 = 3074.0

i.e. 1000 females passing through 15-49 years give birth to 3074
babies. Hence the T.F.R. is 3.074 per female.
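
The fertility rates of this problem can be recomputed with the Python sketch below. The variable names are illustrative; the factor 5 is the width of each age interval, as in the worked solution.

```python
age_groups = ["15-19", "20-24", "25-29", "30-34", "35-39", "40-44", "45-49"]
female_pop_thousands = [1593, 1316, 1172, 1031, 876, 733, 601]
births = [101952, 226352, 185176, 131968, 52560, 20524, 2885]

# Age specific fertility rates per 1000 females in each age group.
asfr = [b / (f * 1000) * 1000 for b, f in zip(births, female_pop_thousands)]

# General fertility rate per 1000 females aged 15-49.
gfr = sum(births) / (sum(female_pop_thousands) * 1000) * 1000

# Each rate applies to a 5 year age interval, hence the factor 5.
tfr_per_1000 = 5 * sum(asfr)
tfr_per_female = tfr_per_1000 / 1000

for group, rate in zip(age_groups, asfr):
    print(group, round(rate, 1))
print(round(gfr, 2), round(tfr_per_1000), round(tfr_per_female, 3))
```

The printed values agree with the table: the age specific rates sum to about 614.8, the G.F.R. is 98.53 and the total fertility rate is about 3.074 per female.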

6. Compute (i) crude birth rate (ii) crude death rate (iii) age
specific fertility rates for 25-29 and 35-39 and (iv) age specific
death rate for the age group 65+.

Age group        Population          Births    Deaths
              Males     Females
0-9          17,100     16,900        nil       375
10-24        14,300     15,700        510       190
25-29        11,600     10,400       1410       185
30-34         8,400      9,600       2006       164
35-39         7,200      6,800        950       150
40-64         3,500      4,000        124       201
65+           1,100      1,500        nil       402

Solution

(i) $$\text{C.B.R.} = \frac{B}{P} \times 1000 = \frac{5000}{128100} \times 1000 = 39.03$$

(ii) $$\text{C.D.R.} = \frac{D}{P} \times 1000 = \frac{1667}{128100} \times 1000 = 13.01$$

(iii) Age specific fertility rate for 25-29 years

$$= \frac{\text{number of births for females in the 25-29 group}}{\text{female population in the age group 25-29}} \times 1000 = \frac{1410}{10400} \times 1000 = 135.58$$

$$\text{Age specific fertility rate for 35-39 years} = \frac{950}{6800} \times 1000 = 139.71$$

(iv) Age specific death rate for 65+

$$= \frac{\text{number of deaths in the age group 65+}}{\text{population in the age group of 65+}} \times 1000 = \frac{402}{2600} \times 1000 = 154.62$$
7. Calculate the standardised death rates for the populations A
and B, by taking population A as the standard.

                   Place A                 Place B
Age group    Population   Deaths     Population   Deaths
0-14           15,000       680        20,000       701
15-44          60,000       350        50,000       306
45-64          25,000       270        30,000       289
65+            10,000       180         8,000       176

Solution: Let us calculate the crude rates for places A and B and
also the age specific death rates for population ‘B’.

$$\text{C.D.R. of place A} = \frac{1480}{110000} \times 1000 = 13.45$$

$$\text{C.D.R. of place B} = \frac{1472}{108000} \times 1000 = 13.63$$

Age group   Age specific D.R.   Age specific D.R.   Number of deaths (× 1000) that
            for place ‘A’       for place ‘B’       would have occurred in ‘A’ with
                                                    the mortality experience of ‘B’

0-14            45.33               35.05           15000 × 35.05 =  525750
15-44            5.83                6.12           60000 ×  6.12 =  367200
45-64           10.80                9.63           25000 ×  9.63 =  240750
65+             18.00               22.00           10000 × 22.00 =  220000

                                                                    1353700

S.D.R. of ‘A’ with ‘A’ as the standard is 13.45

$$\text{S.D.R. of `B' with `A' as the standard} = \frac{1353700}{110000} = 12.31$$

Hence the death rate in ‘B’ is less than the death rate in ‘A’.
8. Given below are the data regarding deaths in two districts. On
the basis of the given data calculate the standardised death rates.

                   District A              District B         Age distn.
Age range     Population   No. of     Population   No. of     of a std.
                           deaths                  deaths       1000
0-10             2000        50          1000        20          206
10-55            7000        75          3000        30          583
55 and above     1000        25          2000        40          211

Computation of standardised death rates

                  District A               District B         Std.    (D.R. A)   (D.R. B)
Age group      Pop.   Deaths   D.R.     Pop.   Deaths   D.R.  pop.    × (S.P.)   × (S.P.)
                                                              (S.P.)

0-10           2000     50    25.00     1000     20    20.00   206      5150       4120
10-55          7000     75    10.71     3000     30    10.00   583      6244       5830
55 and above   1000     25    25.00     2000     40    20.00   211      5275       4220

Total        10,000                     6000                  1000     16669      14170

$$\text{Standardised D.R. of district `A'} = \frac{16669}{1000} = 16.67$$

$$\text{Standardised D.R. of district `B'} = \frac{14170}{1000} = 14.17$$


9. Find the standardised death rate by the indirect method for
the following data:

                   Place ‘A’               Standard place
Age group    Population   Deaths     Population   Deaths
0-4            18000        882        20000       1000
5-14           21000        315        15000        225
15-49          26000        260        20000        200
50+             5000        295        15000        900

Solution: Computation of S.D.R. by indirect method

                 Place A               Standard place         D.R. of Std.   D.R. of Std.
Age group    Pop.   Deaths   D.R.    Pop.   Deaths   D.R.     × Std. pop.    × pop. A
0-4         18000     882     49    20000    1000     50        1000000        900000
5-14        21000     315     15    15000     225     15         225000        315000
15-49       26000     260     10    20000     200     10         200000        260000
50+          5000     295     59    15000     900     60         900000        300000
Total      70,000    1752          70,000    2325               2325000       1775000

$$\text{C.D.R. of place `A'} = \frac{1752}{70000} \times 1000 = 25.03$$

$$\text{Adjustment factor} = \frac{1775000}{2325000} = 0.7634$$

$$\text{S.D.R. of place `A'} = 25.03 \times 0.7634 = 19.11$$
EXERCISES

1. What do you understand by the term ‘Vital Statistics’?
2. Describe the nature, method of collection and usefulness of
   Vital Statistics.
3. Explain the main uses of Vital Statistics. Comment on the
   quality of Vital Statistics of India.
4. How does Vital Statistics differ from census data?
5. Explain ‘crude’ and ‘improved’ rates.
6. Define the terms (a) crude birth rate (b) general fertility rate
   and (c) crude death rate.
7. Define age specific mortality rates and age specific fertility rates.
8. Comment on the following statement, ‘The Infant mortality rate
   can be considered as a very sensitive measure indicating the
   health of a community’.
9. Explain the various methods of Intercensal and Postcensal
   population estimates.
10. Explain the terms ‘General fertility rate’ and ‘Specific fertility
    rate’. How are they calculated?
11. Mention some of the features of the 1981 census.
12. The population of Karnataka as on 1.3.1981 was 370.43 lakhs.
    Assuming the rate of growth of population to be about 2.17%,
    estimate the population as on 30.6.1988.
13. The mid-year estimated population for a town in 1984 was
    51,000. The following are some of the vital events available
    for the town for the year 1984.

    Number of births: 1651
    Number of deaths: 498
    Infant deaths: 145
    Number of deaths due to cancers: 56

    Compute the various vital indices and comment.
14. In one area of a city with a population of 1,97,163 in 1983 a
    total of 5910 births and 1970 deaths occurred. Compute the
    crude birth rate, crude death rate and also the growth rate.
15. The following data show the distribution of population and
    some vital events thereof.

    Age group: 0-1, 1-4, 5-14, 15-19, 20-44, 45-64, 65+
    Population (in thousand)
    Male: 3 7 11 14 8 6
    Female: 2 6.5 10.2 13.4 6 4 3
    Number of births: - - - 1170 3515 100 -
    Number of deaths: 140 215 104 435 506 184 701

    Compute the (i) crude birth rate (ii) crude death rate (iii) specific
    fertility rate for 15-19 years (iv) age specific mortality rate for
    0-1 year and 65+.

16. Distinguish between crude death rate and standardised death
    rate.

17. Why is it necessary to calculate the standardised mortality rate
    to compare the mortality experienced in two or more places?


18. Compute the crude and standardised death rates for the two
    places from the following data:

                   Place A                  Place B            Population of
    Age group  Population  Number of   Population  Number of   the country
                           deaths                  deaths      in millions
    0-14         22000       280         19000       230          280
    15-64        38100       310         21000       305          340
    65+          10500       420         10000       302          100

19. Compute the crude and standardised death rates of the two
    populations P and Q, by taking P as the standard population.

                        P                       Q
    Age group   Population   Deaths     Population   Deaths
    0-10          20000        600        11000        356
    10-20         12000        240        25000        585
    20-40         50000       1250        59000       1500
    40-60         30000       1050        14000        421
    60+           10000        500         4500        174

20. Calculate the crude and standardised death rates from the
    following data:

    Age group      Population   Number of deaths   Standard age
                                                   distribution
                                                   (percentage)
    0-9              21000            350              22.1
    10-24            30000            102              29.8
    25-44            37000            229              28.5
    45-64            17000            354              14.9
    65 and over       5000            419               4.7

6. Index Numbers 


6.1 BACKGROUND

An index number is a measure for comparing one group of related
variables with another. The comparison may be, for example, of the
prices of different articles of food at one date with the prices of
the same articles at a different date; the ratio, expressed as a
percentage, is a food price index number. Index numbers are thus
nothing but averages of ratios.

Index numbers are usually calculated annually, but occasionally they
are computed on a weekly, fortnightly or monthly basis also. The
base period, which is the denominator against which the variable in
any later period is compared, may be a week, fortnight, month or
year. By comparing the price of a commodity in the base year with
that in another year, the upward or downward trend in the value of
the variable is reflected. Usually index numbers are calculated for
prices, quantities and cost of living. The price index number is a
measure of the change in the retail or wholesale price of a commodity
or a group of commodities. The quantity index number is a measure of
the changes in the quantity or volume of goods manufactured in an
industry. The cost of living index number is a combination of price
and quantity. It is nothing but the cost of purchasing a set of
commodities like food, housing, cloth, fuel and miscellaneous items.
The cost of living index number indicates the changes in the prices
of such commodities at the present time as compared to a certain
period called the base year.


6.2. USES OF INDEX NUMBERS 

Since index numbers display changes in price, we can compare food or
other living costs for a place during a given year with those of a
preceding year, or compare the production of a commodity during the
current year with that of previous periods. Apart from these, the
purchasing power of money is also revealed by index numbers. As is
well known, the value of, say, a rupee is sliding down and more money
is required at present than in the past, owing to the increased cost
of goods and services; thus the economic condition of a country,
whether a recession or a boom, is well made out by the index numbers.

As a policy, the government has been neutralising the effect of
increases in the cost of living index number by granting dearness
allowance to its employees, and presently for every increase of 8
points the allowance is increased automatically. This has been the
practice in almost all the public and private sectors too. Naturally,
over a period of time wages have risen considerably for most
categories, evidently due to the manifold increase in the cost of
living. Index numbers are also used for forecasting business and
economic conditions for a later period, since they provide
information on seasonal fluctuations, wages, production and so forth.

The technique of index number compilation has been extended from
economic and business planning to various other fields. In the field
of education, the evaluation of the teacher or the taught from one
year to another, or from one area to another, is used for planning
and research. Similarly, the intelligence quotient is measured and
expressed on the same lines as index numbers.


6.3. CONSTRUCTION OF INDEX NUMBERS FOR PRICES 
AND QUANTITIES—BASIC NEEDS 
Before embarking on the construction of index numbers, either for
prices or quantities, the following prerequisites are to be
considered.
The purpose of the index number, and the group of items with which
it is concerned, must be clearly made out. Normally the items
considered for the compilation of wholesale index numbers are food
articles, industrial raw materials, manufactures and miscellaneous
goods. These are further divided into subgroups, each containing
several commodities representing the subgroup. For instance, the
group ‘food articles’ is divided into cereals, pulses, beverages and
other food articles. Further, if the purpose is to construct a cost
of living index for a specified category of the population, the items
not normally used by the said group can be omitted. After deciding on
the items for inclusion, the information about the prices and the
quantities consumed at various places and periods needs due
consideration, since these are the main variables as far as the
construction of index numbers is concerned. The data on prices are to
be obtained from authentic, reliable and official publications. The
Bureau of Economics and Statistics, the Dept. of Marketing, Janatha
bazaars, fair price shops, All India Radio bulletins, trade journals
and daily newspapers are some of the most reliable sources from which
the data can be obtained.

As soon as the prices of the commodities are obtained, the problem
encountered is the choice of the base period: whether it is to be a
week, fortnight, month or year. Since changes are unlikely to occur
frequently and since index numbers are to be computed over quite a
long span of time, the year is usually taken as the base period. Thus
the average price prevailing in a base year (say 1971) is taken as
100 and the price for another year (say 1983) is expressed in terms
of 1971 (the base year). Similarly, the total production of a
particular commodity in a year is compared with the production during
the base period. Care must be taken in selecting the base year. The
year selected should be one in which there was no undue economic
recession or boom. In other words, the base year should as far as
possible be free from war, flood, drought, labour problems and the
like. Ultimately the base year should be neither an abnormal nor a
subnormal year but just a normal year, since the price relatives are
compared and these are essential for planning and decision making.
In practice two types of index numbers are obtained. One is the
fixed base method, wherein the base year condition is related to the
present or vice versa. If the interval between the base period and
the current period is too wide, changes in the pattern of consumption
and in attitudes are not revealed, and hence another method, namely
the chain base method, is used wherein the base year is altered every
year. The base year for the purpose of comparing the prices or
quantities of the current year is the previous year, and for the
succeeding year the base period is the current year. This method has
the advantage that items can be deleted or added as the situation
demands. Its greatest disadvantage is that the base period is meant
to be a standard or yardstick against which the situation at a later
period is studied, and comparisons are not possible if the standard
keeps changing.

Finally, the method of arriving at a single index number which
summarises a large amount of information is to be decided, since
several methods of averaging exist. Several index numbers can be
computed using such averages, each of which possesses certain merits
and demerits.


The following are some of the methods used in practice: 


Price relatives: This is defined as the ratio of the price of a single 
commodity in a given period to its price in another period called the 
base or reference period. This is generally expressed as a percentage. 


$$\text{Price relative} = \frac{p_1}{p_0} \times 100$$

where $p_1$ = price of a commodity for the given period
      $p_0$ = price of the commodity during the base period.

Example: The consumer prices of 'The Hindu' in 1984 and 1980 were 0.90 p and 0.75 p respectively.

$$\text{Price relative} = \frac{0.90}{0.75} \times 100 = 120\%$$

Thus the price of the newspaper 'The Hindu' in 1984 is 120% of that in 1980, which means it has increased by 20%.


Quantity relatives: Just as price relatives compare prices, it may be necessary to compare the quantity or volume of products, exports, industrial production etc.

$$\text{Quantity relative} = \frac{q_1}{q_0} \times 100$$

where $q_1$ = quantity of a commodity produced, consumed, imported, exported etc. during a given period and $q_0$ = quantity of the commodity produced, etc. for the base period.

Value relative: The product of the price and quantity relatives is known as the value relative, since the total value is the product of the quantity of goods produced and its price.

$$\text{Value relative} = \frac{p_1}{p_0} \cdot \frac{q_1}{q_0} = \frac{p_1 q_1}{p_0 q_0}$$
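As a small numerical sketch of the three relatives (the function name `relative` is an assumption made for illustration), the newspaper prices quoted above and the hostel mess bill figures of Example 3 in Section 6.7 can be worked through as follows. The price relative of the mess bill is recovered by dividing the value relative by the quantity relative.

```python
def relative(current, base):
    """Return `current` as a percentage of `base`."""
    return current / base * 100


# Newspaper example: price in 1984 against price in 1980
hindu = relative(0.90, 0.75)

# Hostel example: 140 boarders and Rs. 21,770 in September,
# 40 more boarders and Rs. 5,050 more in October
quantity_rel = relative(140 + 40, 140)
value_rel = relative(21770 + 5050, 21770)
price_rel = value_rel / quantity_rel * 100  # value = price x quantity

print(round(hindu, 2), round(quantity_rel, 2),
      round(value_rel, 2), round(price_rel, 2))
```

The four figures come out at about 120, 128.57, 123.20 and 95.82, in line with the worked examples.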


6.4 SIMPLE AGGREGATE METHOD OF PRICE INDEX 
This is defined as the ratio of the sum of the various commodity prices for a given year to the sum of the commodity prices during the base year, expressed as a percentage.

$$\text{Simple aggregate price index} = \frac{\sum p_1}{\sum p_0} \times 100$$

where $\sum p_1$ = sum of commodity prices in year '1' or current year and $\sum p_0$ = sum of commodity prices during the base period.
This method is unsuitable since the units in which the prices of the various commodities are quoted may be different, and further the relative importance of the different commodities is not looked into.


Simple average method: In this method the price relatives are first obtained for the various commodities, and the average (arithmetic mean, geometric mean, harmonic mean or median) of these price relatives is known as the simple average of relatives.

Hence,

$$\text{Simple A.M. of relatives price index} = \frac{1}{n} \sum \left( \frac{p_1}{p_0} \times 100 \right)$$

where $n$ is the number of commodities. This method is an improvement over the simple aggregate method as far as the units of price of the various commodities are concerned, but, as in the previous method, no weightage is given to the relative importance of the various commodities.
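The two unweighted methods can be compared on the dairy prices of Example 4 in Section 6.7. The sketch below is illustrative only; the function names are assumptions, and the averages come from Python's standard `statistics` module.

```python
from statistics import geometric_mean, mean, median

# Prices per kg of five dairy products (Example 4, Section 6.7)
prices_1979 = {"Milk": 2.60, "Buttermilk": 2.20, "Toned milk": 2.50,
               "Butter": 24.00, "Ghee": 32.00}
prices_1983 = {"Milk": 3.40, "Buttermilk": 3.10, "Toned milk": 2.90,
               "Butter": 32.00, "Ghee": 41.00}


def simple_aggregate_index(current, base):
    """Sum of current prices as a percentage of the sum of base prices."""
    return sum(current.values()) / sum(base.values()) * 100


def price_relatives(current, base):
    """Price relative of every commodity, in the order of `base`."""
    return [current[item] / base[item] * 100 for item in base]


aggregate = simple_aggregate_index(prices_1983, prices_1979)
relatives = price_relatives(prices_1983, prices_1979)

print(round(aggregate, 2))                  # simple aggregate index
print(round(mean(relatives), 2))            # A.M. of relatives
print(round(geometric_mean(relatives), 2))  # G.M. of relatives
print(round(median(relatives), 2))          # median of relatives
```

Because butter and ghee cost roughly ten times as much per kg as the milk products, they dominate the simple aggregate index, whereas in the average of relatives every commodity counts equally.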


6.5 WEIGHTED AGGREGATE METHOD 

The demerit of the simple aggregate and simple average of relatives methods, namely that no weights are attached to the prices of the various commodities, is overcome in this method. The prices of the various commodities are weighted by an appropriate factor, namely the quantity of the commodity consumed or sold during a specified year (the base year or the current year). Thus the importance of the various commodities is reflected in this method.

The following are the various weighted aggregate price index methods used in practice.


1. Paasche's index 


This is defined as the ratio of the weighted aggregates of prices, the weights being the quantities produced or consumed in the current year, expressed as a percentage. Thus

$$P_{01} = \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100$$

where $p_1$ is the price during the current year
      $p_0$ is the price during the base year
      $q_1$ is the quantity consumed or produced during the current year.


2. Laspeyres' method

This is the ratio of the weighted aggregates of prices, the weights being the quantities consumed or produced during the base year.

$$P_{01} = \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100$$

where $q_0$ is the quantity consumed or produced during the base year. Note that the weights refer to the base year in this method, whereas in Paasche's method the weights refer to the current year.


3. Marshall-Edgeworth method 


This method is a slight improvement over the other two. Here the arithmetic mean of the base year and current year quantities is used as the weight. Thus

$$P_{01} = \frac{\sum (q_0 + q_1) p_1}{\sum (q_0 + q_1) p_0} \times 100$$

(the factor 1/2 of the arithmetic mean cancels between the numerator and the denominator).

It can be observed from these methods that Paasche's index shows the cost of the current year's production or consumption at current year prices as compared to its cost at base year prices, whereas Laspeyres' index represents the cost of maintaining the same rate of consumption or production as in the base year at current year prices. In the case of Paasche's index, since the current year quantities are considered and the higher cost of a commodity during the current year may reduce its actual consumption, the index normally shows a downward bias. The reverse is the case with Laspeyres' index, wherein an upward bias is seen due to the use of base year quantities. This is slightly overcome by the Marshall-Edgeworth method, wherein the average of the quantities for the base and current years is used.


4. Fisher's ‘ideal’ index number 
This is defined as the geometric mean of Paasche's and Laspeyres' index numbers. Thus

$$P_{01} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}} \times 100$$

This method is harder to compute than the others. In addition, it requires the quantities for both the current and base years, which are difficult to obtain, as data collection is far more difficult and heart-breaking than data processing.

In all the above mentioned systems the quantities are the weights for computing the price index numbers; alternatively, if prices are taken as weights, we obtain quantity index numbers.


Thus, Paasche's quantity index number $(Q_{01}) = \frac{\sum q_1 p_1}{\sum q_0 p_1} \times 100$

Laspeyres' quantity index number $= \frac{\sum q_1 p_0}{\sum q_0 p_0} \times 100$


Using these, the monetary value of the commodities consumed or 
produced can be related or compared over a period of time. 
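The four weighted price index formulas of this section can be sketched in Python using the data of Example 7 in Section 6.7. The helper name `total` and the layout of the data are assumptions made for illustration; Fisher's index is computed as the geometric mean of the Laspeyres and Paasche figures.

```python
from math import sqrt

# commodity: (base quantity, base price, current quantity, current price)
data = {
    "A": (30, 8, 35, 6),
    "B": (15, 5, 20, 8),
    "C": (25, 6, 20, 7),
    "D": (20, 2, 15, 3),
    "E": (30, 3, 35, 2),
    "F": (10, 7, 15, 9),
}
q0, p0, q1, p1 = (list(column) for column in zip(*data.values()))


def total(prices, quantities):
    """Sum of price x quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


laspeyres = total(p1, q0) / total(p0, q0) * 100   # base year weights
paasche = total(p1, q1) / total(p0, q1) * 100     # current year weights
marshall_edgeworth = ((total(p1, q0) + total(p1, q1))
                      / (total(p0, q0) + total(p0, q1)) * 100)
fisher = sqrt(laspeyres * paasche)                # geometric mean of the two

print(round(laspeyres, 2), round(paasche, 2),
      round(marshall_edgeworth, 2), round(fisher, 2))
```

For these data the Marshall-Edgeworth and Fisher figures both lie between the Laspeyres and Paasche values, as expected of compromise formulas.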


6.6 TEST FOR INDEX NUMBERS 

The index numbers obtained for individual commodities are expected, at least theoretically, to be reflected in the index numbers for a group of commodities. The following are some of the tests used in practice. Index numbers which fulfil the stated property are said to meet the particular test.

1. Time reversal test

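If two periods are interchanged, the corresponding price relatives are reciprocals of each other. This is a test for consistency of an index number, which is worked out both forward and backward with regard to time. (In this section the index numbers are written as ratios, omitting the factor 100.)

Consider Paasche's price index number

$$P_{01} = \frac{\sum p_1 q_1}{\sum p_0 q_1}$$

If the periods are interchanged,

$$P_{10} = \frac{\sum p_0 q_0}{\sum p_1 q_0}$$

The time reversal test states that $P_{01} = \frac{1}{P_{10}}$, or $P_{01} \times P_{10} = 1$.

Paasche's index number does not satisfy this test since

$$\frac{\sum p_1 q_1}{\sum p_0 q_1} \times \frac{\sum p_0 q_0}{\sum p_1 q_0} \neq 1$$

In the case of Laspeyres' index number

$$P_{01} = \frac{\sum p_1 q_0}{\sum p_0 q_0}, \qquad \text{and interchanging the periods} \qquad P_{10} = \frac{\sum p_0 q_1}{\sum p_1 q_1}$$

The product of these, namely

$$\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_0 q_1}{\sum p_1 q_1},$$

is not equal to one, and hence this index also does not satisfy the time reversal property.

For the Marshall-Edgeworth form

$$P_{01} = \frac{\sum (q_0 + q_1) p_1}{\sum (q_0 + q_1) p_0}, \qquad P_{10} = \frac{\sum (q_1 + q_0) p_0}{\sum (q_1 + q_0) p_1}$$

Hence

$$P_{01} \times P_{10} = \frac{\sum (q_0 + q_1) p_1}{\sum (q_0 + q_1) p_0} \times \frac{\sum (q_1 + q_0) p_0}{\sum (q_1 + q_0) p_1} = 1$$

Hence the Marshall-Edgeworth formula satisfies the time reversal test.

Finally, in the case of Fisher's 'ideal' index method

$$P_{01} \times P_{10} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \cdot \frac{\sum p_1 q_1}{\sum p_0 q_1}} \times \sqrt{\frac{\sum p_0 q_1}{\sum p_1 q_1} \cdot \frac{\sum p_0 q_0}{\sum p_1 q_0}} = 1$$

Thus, Fisher's ideal index number also satisfies the time reversal property.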

2. Factor reversal test 

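The index number obtained by interchanging the factors, namely prices and quantities, in the price index number formula, when multiplied by the original index number, should be equal to the value index number:

$$P_{01} \cdot Q_{01} = V_{01} = \frac{\sum p_1 q_1}{\sum p_0 q_0}$$

Consider Laspeyres' index $P_{01} = \frac{\sum p_1 q_0}{\sum p_0 q_0}$. If the factors are interchanged,

$$Q_{01} = \frac{\sum q_1 p_0}{\sum q_0 p_0}$$

Thus the product $P_{01} Q_{01}$ is not equal to the value index, and hence Laspeyres' index does not satisfy the factor reversal test. Similarly it can be seen that Paasche's index does not satisfy this test, nor does the Marshall-Edgeworth index.

For Fisher's 'ideal' index number

$$P_{01} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \cdot \frac{\sum p_1 q_1}{\sum p_0 q_1}}$$

Interchanging the factors we get

$$Q_{01} = \sqrt{\frac{\sum q_1 p_0}{\sum q_0 p_0} \cdot \frac{\sum q_1 p_1}{\sum q_0 p_1}}$$

Hence the product

$$P_{01} \cdot Q_{01} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \cdot \frac{\sum p_1 q_1}{\sum p_0 q_1} \cdot \frac{\sum q_1 p_0}{\sum q_0 p_0} \cdot \frac{\sum q_1 p_1}{\sum q_0 p_1}} = \frac{\sum p_1 q_1}{\sum p_0 q_0} = V_{01}, \text{ the value index number.}$$

Fisher's ideal index number meets the property of the factor reversal test.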
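The time reversal and factor reversal tests can also be checked numerically. The following sketch, which is an illustration and not part of the original text, uses the data of Example 9 in Section 6.7; the function names and the argument order (base prices, base quantities, current prices, current quantities) are assumptions. Interchanging the periods swaps the two pairs of arguments, and interchanging the factors swaps prices with quantities.

```python
from math import isclose, sqrt

# Example 9 data: base price, base quantity, current price, current quantity
p0 = [6, 2, 4, 10, 8]
q0 = [50, 100, 60, 30, 40]
p1 = [10, 2, 6, 12, 12]
q1 = [56, 120, 60, 24, 36]


def total(a, b):
    """Sum of the products of corresponding entries."""
    return sum(x * y for x, y in zip(a, b))


def laspeyres(pa, qa, pb, qb):
    """Index of period b on period a with period-a weights (as a ratio)."""
    return total(pb, qa) / total(pa, qa)


def paasche(pa, qa, pb, qb):
    """Index of period b on period a with period-b weights (as a ratio)."""
    return total(pb, qb) / total(pa, qb)


def fisher(pa, qa, pb, qb):
    return sqrt(laspeyres(pa, qa, pb, qb) * paasche(pa, qa, pb, qb))


value_index = total(p1, q1) / total(p0, q0)

results = {}
for name, index in [("Laspeyres", laspeyres), ("Paasche", paasche),
                    ("Fisher", fisher)]:
    forward = index(p0, q0, p1, q1)
    backward = index(p1, q1, p0, q0)   # periods interchanged
    quantity = index(q0, p0, q1, p1)   # prices and quantities interchanged
    results[name] = (isclose(forward * backward, 1.0),       # time reversal
                     isclose(forward * quantity, value_index))  # factor reversal

print(results)
```

Only Fisher's index passes both checks on these data; the Laspeyres and Paasche products miss unity and the value index by small but non-zero amounts.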


3. Circular or cyclical test 
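This is only an extension of the time reversal test to the shifting of base periods over more than two periods.
Thus $P_{01} \times P_{12} \times P_{23} \times P_{30} = 1$

Evidently Laspeyres', Paasche's and Marshall-Edgeworth index numbers do not satisfy this property.
Consider the Marshall-Edgeworth index as an illustration:

$$P_{01} = \frac{\sum p_1 (q_0 + q_1)}{\sum p_0 (q_0 + q_1)}, \qquad P_{12} = \frac{\sum p_2 (q_1 + q_2)}{\sum p_1 (q_1 + q_2)}, \qquad P_{20} = \frac{\sum p_0 (q_2 + q_0)}{\sum p_2 (q_2 + q_0)}$$

Thus $P_{01} \cdot P_{12} \cdot P_{20} \neq 1$
For Fisher's ideal index number

$$P_{01} \cdot P_{12} \cdot P_{20} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \cdot \frac{\sum p_1 q_1}{\sum p_0 q_1}} \; \sqrt{\frac{\sum p_2 q_1}{\sum p_1 q_1} \cdot \frac{\sum p_2 q_2}{\sum p_1 q_2}} \; \sqrt{\frac{\sum p_0 q_2}{\sum p_2 q_2} \cdot \frac{\sum p_0 q_0}{\sum p_2 q_0}} \neq 1$$

Hence none of the index numbers mentioned here satisfies the circular test.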
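A numerical illustration of the circular test over three periods is sketched below. The prices and quantities are hypothetical figures invented purely for illustration, and the function names are assumptions; the point is only that the product taken round the circle 0, 1, 2, 0 departs from one.

```python
from math import sqrt

# Hypothetical prices and quantities of two commodities in periods 0, 1, 2
prices = [[2, 5], [3, 6], [4, 5]]
quantities = [[10, 4], [8, 5], [9, 6]]


def total(a, b):
    return sum(x * y for x, y in zip(a, b))


def laspeyres(a, b):
    """Price index of period b on base period a (as a ratio)."""
    return total(prices[b], quantities[a]) / total(prices[a], quantities[a])


def paasche(a, b):
    return total(prices[b], quantities[b]) / total(prices[a], quantities[b])


def marshall_edgeworth(a, b):
    combined = [x + y for x, y in zip(quantities[a], quantities[b])]
    return total(prices[b], combined) / total(prices[a], combined)


def fisher(a, b):
    return sqrt(laspeyres(a, b) * paasche(a, b))


def circular_product(index):
    """P01 x P12 x P20, which the circular test requires to equal 1."""
    return index(0, 1) * index(1, 2) * index(2, 0)


for index in (laspeyres, marshall_edgeworth, fisher):
    print(index.__name__, round(circular_product(index), 4))
```

With these figures the three products come out near 1.036, 0.979 and 0.976 respectively, so none of the three formulas returns exactly to its starting point.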




6.7 EXAMPLES 
1. The average retail prices of cement per bag (50 kg) in Bangalore during 1976-83 are given below:

Year:    1976   1977   1978   1979   1980   1981   1982   1983
Price:  58.00  63.00  62.00  64.50  65.00  75.50  88.10  78.00

Obtain the price relatives (a) for 1982 and 1983 using 1976 as base and (b) for all the years using 1976-78 as base.

(a) Price relative for 1982 $= \frac{\text{Price in 1982}}{\text{Price in 1976}} \times 100 = \frac{88.10}{58.00} \times 100 = 151.90$

Price relative for 1983 $= \frac{78.00}{58.00} \times 100 = 134.48$

(b) The base period price for 1976-78 is the A.M. of the prices for the 3 years $= \frac{58.00 + 63.00 + 62.00}{3} = 61.00$

Hence the price relatives for 1976, 1977, ..., 1983 are

95.08, 103.28, 101.64, 105.74, 106.56, 123.77, 144.43 and 127.87


2. The following is the information pertaining to total food production for various years in India.

Year:                                    1980    1981    1982    1983    1984
Food production (million metric tons): 142.00  147.00  145.10  146.60  156.80

Obtain the quantity relatives with 1982 as base and with 1980 as base.
Quantity relatives with 1982 as base are:

Year:                            1980    1981    1982    1983    1984
Quantity relatives (1982=100):  97.86  101.31  100.00  101.03  108.06

whereas the quantity relatives with 1980 as base are

Quantity relatives (1980=100): 100.00  103.52  102.18  103.24  110.42

3. During September 1984 a college students' hostel incurred a mess bill of Rs. 21,770 for 140 boarders. In October 1984, the hostel had 40 more boarders and incurred an additional Rs. 5050. Find the quantity and the value relatives using September 1984 as base.

Quantity relative $= \frac{140 + 40}{140} \times 100 = 128.57$

Value relative $= \frac{21770 + 5050}{21770} \times 100 = 123.20$

Hence the price relative $= \frac{\text{value relative}}{\text{quantity relative}} \times 100 = \frac{123.20}{128.57} \times 100 = 95.82$

Thus 95.82 is nothing but the cost towards mess per boarder for October 1984 with September 1984 as base period.
The same can be interpreted as follows:

Mess bill per boarder in Sept. 1984 $= \frac{21770}{140} =$ Rs. 155.50

Mess bill per boarder in Oct. 1984 $= \frac{26820}{180} =$ Rs. 149

Price relative for Oct. 1984 with Sept. 1984 as base $= \frac{149}{155.50} \times 100 = 95.82$

4. The following data show the average price of certain dairy products in a town for the years 1979, 1980, 1981, 1982 and 1983. Compute the simple aggregate price index for these products for 1983 using 1979 as base and 1979-81 as base.

Dairy                  Price per kg
products       1979    1980    1981    1982    1983
Milk           2.60    2.90    2.90    3.10    3.40
Buttermilk     2.20    2.70    2.75    3.00    3.10
Toned milk     2.50    2.65    2.70    2.90    2.90
Butter        24.00   26.00   31.00   28.00   32.00
Ghee          32.00   36.00   38.00   40.00   41.00

Simple aggregate price index (1983) with 1979 as base

$$= \frac{\sum p_1}{\sum p_0} \times 100 = \frac{3.40 + 3.10 + 2.90 + 32.00 + 41.00}{2.60 + 2.20 + 2.50 + 24.00 + 32.00} \times 100 = \frac{82.40}{63.30} \times 100 = 130.17$$

The simple aggregate price index for 1983 with 1979-81 as base is computed by first obtaining the average price of each item over 1979-81.

Average milk price (1979-81) $= \frac{2.60 + 2.90 + 2.90}{3}$ =  2.80
Average buttermilk price                                     =  2.55
Average toned milk price                                     =  2.62
Average butter price                                         = 27.00
Average ghee price                                           = 35.33
Total                                                          70.30

Simple aggregate price index for 1983 (with base 1979-81)

$$= \frac{82.40}{70.30} \times 100 = 117.21$$

Instead of considering the aggregate prices, the average of relatives can also be obtained. Thus

Price relative for milk for 1983 (base 1979)       $= \frac{3.40}{2.60} \times 100 = 130.77$
Price relative for buttermilk for 1983 (base 1979) $= \frac{3.10}{2.20} \times 100 = 140.91$
Price relative for toned milk for 1983 (base 1979) $= \frac{2.90}{2.50} \times 100 = 116.00$
Price relative for butter for 1983 (base 1979)     $= \frac{32.00}{24.00} \times 100 = 133.33$
Price relative for ghee for 1983 (base 1979)       $= \frac{41.00}{32.00} \times 100 = 128.125$

The A.M. of the price relatives

$$= \frac{130.77 + 140.91 + 116.00 + 133.33 + 128.125}{5} = 129.83$$

The G.M. of the price relatives

$$= \sqrt[5]{(130.77)(140.91)(116.00)(133.33)(128.125)} = 129.57$$

whereas the median of the price relatives is 130.77.

5. Obtain a suitable index number for the following data by taking 1977 as base year:

Commodities           A     B     C     D
Price in 1980         5    1.5    2     3
Price in 1977         3     4     3     2
Quantities in 1977   25    32     8     5

Commodities   Price 1977   Price 1980   Quantity 1977   p1q0   p0q0
              (p0)         (p1)         (q0)
A                 3            5            25           125     75
B                 4           1.5           32            48    128
C                 3            2             8            16     24
D                 2            3             5            15     10
Total                                                    204    237

Laspeyres' index $= \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 = \frac{204}{237} \times 100 = 86.08$

Since the quantities for the year 1980 are not provided, Laspeyres' index is the most suitable one.


6. From the following data compute Laspeyres' and Paasche's index numbers.

                     Quantities                  Prices
Commodities   base year  current year    base year  current year
A                 12          15             10          12
B                 15          20              7           5
C                 24          20              5           9
D                  5           5             16          14

(PUC, April 1978)

Denoting the columns by $q_0$, $q_1$, $p_0$ and $p_1$ respectively, the products $p_1 q_0$, $p_0 q_0$, $p_1 q_1$ and $p_0 q_1$ are obtained thus:

Commodities    q0    q1    p0    p1    p1q0   p0q0   p1q1   p0q1
A              12    15    10    12     144    120    180    150
B              15    20     7     5      75    105    100    140
C              24    20     5     9     216    120    180    100
D               5     5    16    14      70     80     70     80
Total                                   505    425    530    470

Laspeyres' index number $= \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 = \frac{505}{425} \times 100 = 118.82$

Paasche's index number $= \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100 = \frac{530}{470} \times 100 = 112.77$


7. Compute index numbers of Paasche, Laspeyres and Marshall-Edgeworth types for the following data:

Commodity       Base year            Current year
             quantity   price     quantity   price
A               30        8          35        6
B               15        5          20        8
C               25        6          20        7
D               20        2          15        3
E               30        3          35        2
F               10        7          15        9

Commodity    q0    p0    q1    p1    p0q0   p1q0   p0q1   p1q1
A            30     8    35     6     240    180    280    210
B            15     5    20     8      75    120    100    160
C            25     6    20     7     150    175    120    140
D            20     2    15     3      40     60     30     45
E            30     3    35     2      90     60    105     70
F            10     7    15     9      70     90    105    135
Total                                 665    685    740    760

Laspeyres' index $= \frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 = \frac{685}{665} \times 100 = 103.01$

Paasche's index $= \frac{\sum p_1 q_1}{\sum p_0 q_1} \times 100 = \frac{760}{740} \times 100 = 102.70$

Marshall-Edgeworth index $= \frac{\sum p_1 (q_0 + q_1)}{\sum p_0 (q_0 + q_1)} \times 100 = \frac{\sum p_1 q_0 + \sum p_1 q_1}{\sum p_0 q_0 + \sum p_0 q_1} \times 100 = \frac{685 + 760}{665 + 740} \times 100 = \frac{1445}{1405} \times 100 = 102.85$


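8. Calculate Fisher's ideal index number for the following data:

                 Base year              Current year
Commodity    Price    Quantity      Price    Quantity
A              20        4            24        5
B              15        5            24        3
C              30        2            12        5
D              50        1            50        2

(PUC, March 1977)

Fisher's ideal index number $= \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \times \frac{\sum p_1 q_1}{\sum p_0 q_1}} \times 100$

From the data as tabulated, $\sum p_1 q_0 = 290$, $\sum p_0 q_0 = 265$, $\sum p_1 q_1 = 352$ and $\sum p_0 q_1 = 395$, so that

$$= \sqrt{\frac{290}{265} \times \frac{352}{395}} \times 100 = \sqrt{0.9752} \times 100 = 98.75$$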
9. Show that the Marshall-Edgeworth index satisfies the time reversal test but not the factor reversal test using the following data.

                 Base year               Current year
Commodity    Price    Quantity       Price    Quantity
             (p0)      (q0)          (p1)      (q1)
A              6        50             10        56
B              2       100              2       120
C              4        60              6        60
D             10        30             12        24
E              8        40             12        36

Using the data, the following products are obtained:

Commodity    p1q0    p0q0    p1q1    p0q1
A             500     300     560     336
B             200     200     240     240
C             360     240     360     240
D             360     300     288     240
E             480     320     432     288
Total        1900    1360    1880    1344

Marshall-Edgeworth index number

$$P_{01} = \frac{\sum p_1 (q_0 + q_1)}{\sum p_0 (q_0 + q_1)} = \frac{\sum p_1 q_0 + \sum p_1 q_1}{\sum p_0 q_0 + \sum p_0 q_1} = \frac{1900 + 1880}{1360 + 1344} = 1.3979$$

If the periods are interchanged

$$P_{10} = \frac{\sum p_0 (q_1 + q_0)}{\sum p_1 (q_1 + q_0)} = \frac{\sum p_0 q_1 + \sum p_0 q_0}{\sum p_1 q_1 + \sum p_1 q_0} = \frac{1344 + 1360}{1880 + 1900} = 0.7153$$

Hence $P_{01} \times P_{10} = 1$, so that the time reversal test is satisfied by this index number.

Interchanging the factors, namely prices and quantities,

$$Q_{01} = \frac{\sum q_1 (p_0 + p_1)}{\sum q_0 (p_0 + p_1)} = \frac{\sum q_1 p_0 + \sum q_1 p_1}{\sum q_0 p_0 + \sum q_0 p_1} = \frac{1344 + 1880}{1360 + 1900} = \frac{3224}{3260} = 0.9890$$

For satisfying the factor reversal test, the product should be equal to the value index, $P_{01} \cdot Q_{01} = V_{01}$.

L.H.S. = (1.3979)(0.9890) = 1.3825

But $V_{01} = \frac{\sum p_1 q_1}{\sum p_0 q_0} = \frac{1880}{1360} = 1.3824$

Hence the Marshall-Edgeworth index does not satisfy the factor reversal test, although for these data the discrepancy is small.

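10. Prove using the data given under problem (9) that the factor reversal test is satisfied by Fisher's ideal index number.

Fisher's ideal index $P_{01} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \cdot \frac{\sum p_1 q_1}{\sum p_0 q_1}}$

From the products obtained for the previous problem we have

$$P_{01} = \sqrt{\frac{1900}{1360} \cdot \frac{1880}{1344}} = \sqrt{1.9542} = 1.3979$$

$$Q_{01} = \sqrt{\frac{\sum q_1 p_0}{\sum q_0 p_0} \cdot \frac{\sum q_1 p_1}{\sum q_0 p_1}} = \sqrt{\frac{1344}{1360} \cdot \frac{1880}{1900}} = \sqrt{0.9778} = 0.9889$$

The value index $V_{01} = \frac{\sum p_1 q_1}{\sum p_0 q_0} = \frac{1880}{1360} = 1.3824$

$P_{01} \cdot Q_{01} = (1.3979)(0.9889) = 1.3824 = V_{01}$

which shows that the factor reversal test is satisfied by Fisher's ideal index.

Alternatively this can be obtained by the following method:

$$P_{01} \cdot Q_{01} = \sqrt{\frac{\sum p_1 q_0}{\sum p_0 q_0} \cdot \frac{\sum p_1 q_1}{\sum p_0 q_1} \cdot \frac{\sum q_1 p_0}{\sum q_0 p_0} \cdot \frac{\sum q_1 p_1}{\sum q_0 p_1}} = \sqrt{\frac{1900}{1360} \cdot \frac{1880}{1344} \cdot \frac{1344}{1360} \cdot \frac{1880}{1900}}$$

$$= \sqrt{\frac{(1880)^2}{(1360)^2}} = \frac{1880}{1360} = 1.3824$$

which is nothing but the value index $(V_{01})$.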


6.8 COST OF LIVING INDEX NUMBERS
Cost of living index numbers are a measure of the changes in the cost of specified items consumed by a group of people in the current year as compared to the base period. Basically the cost of living index is an index number of prices, and these prices are for the commodities consumed by a defined class of people, like industrial workers, middle class people and so on. The standard of living can be assessed from the cost of living index number. It should be noted clearly that other factors like income of the family, household size etc. are not considered as far as the standard of living concept in this context is concerned. The standard of living refers only to the purchase of a fixed set of goods, and only the costs are indicated. If the cost of living index number for the current year (1984) is 574 as compared to the base period 1961, then to maintain the same standard as in 1961 the income should also be raised to 574 for every 100 earned in 1961. If an employee were to get the same wages as in 1961, he would be handicapped and face financial stress; he has to be paid a wage on which he can live, and hence the government is committed, as a matter of policy, to granting dearness, interim or other such allowances to employees to cope with rising costs.

Some of the factors to be borne in mind while compiling cost of living index numbers are the specification of the class of people and the region referred to, the items of commodities to be included, the prices to be used, whether wholesale or retail, and the weightage for the various items. Cost of living index numbers are usually compiled for various categories like industrial workers, working class, low salaried workers and the like. Again these may be computed for a city, a state or an industrial township, and hence it is essential to specify the place as well as the people for whom the index has been compiled. After deciding about the place and people, the next thing is to ascertain the various items that are to be included. The items consumed are usually obtained by a family budget enquiry, which is nothing but a sample survey wherein data are collected on the family size, number of dependents, and amounts spent on various items like food, clothing, fuel and light, house rent and miscellaneous. Under miscellaneous, the items not covered under the other heads, like expenditure towards education, amusements, medical charges, savings etc., are included.

The next step is the division of, say, workers into various strata on the basis of their income; information is collected separately for the various strata with respect to consumption pattern to obtain the cost of living index or consumers' price index numbers. Finally the retail prices of the items have to be collected from official sources, both for the current and base periods, for the purpose of computing the cost of living index number. The unit of measurement and the quality of articles are to be specified. It is not advisable to refer to average rent per house or average expenditure towards education; the actuals for the various sets of workers in the family budget enquiry are to be gathered.
The cost of living index number can be computed by either (a) the aggregate expenditure method or (b) the family budget method. The aggregate expenditure is nothing but the total expenditure towards various items. The ratio of the aggregate expenditure in the current year ($\sum p_1 q_0$) to the aggregate expenditure towards the same items in the base year ($\sum p_0 q_0$), expressed per 100, is known as the cost of living index number. In other words, the aggregate expenditure method is Laspeyres' index number.

In the case of the family budget method, the price relatives are weighted by the amount of expenditure incurred by the families on the various items. The money spent by the family is reflected in this method.

$$\text{Cost of living index number} = \frac{\sum (p_0 q_0) \left( \frac{p_1}{p_0} \times 100 \right)}{\sum p_0 q_0} = \frac{\sum W P}{\sum W}$$

where $W = p_0 q_0$ is the weight and $P = \frac{p_1}{p_0} \times 100$ is the price relative.

Example

                                Quantity             Price per unit
S. No.  Item        Unit        consumed         Base year   Current year
                                (base year)       (1971)       (1983)
1.      Rice        kg             20              2.10         4.50
2.      Dal         kg              4              3.50         8.00
3.      Sugar       kg              5              1.75         3.85
4.      G.N. Oil    kg              3              8.50        17.50
5.      Kerosene    litre          10              1.10         2.30
6.      Firewood    quintal         1             31.00        80.00
7.      Dhoti       pair            1             25.00        56.00
8.      Saree       1               4             45.00       100.00
9.      House rent  1               1            120.00       300.00

Aggregate expenditure method

             Quantity   Base year    Current year
Items          (q0)     price (p0)   price (p1)       p1q0       p0q0
Rice            20         2.10         4.50          90.00      42.00
Dal              4         3.50         8.00          32.00      14.00
Sugar            5         1.75         3.85          19.25       8.75
G.N. Oil         3         8.50        17.50          52.50      25.50
Kerosene        10         1.10         2.30          23.00      11.00
Firewood         1        31.00        80.00          80.00      31.00
Dhoti            1        25.00        56.00          56.00      25.00
Saree            4        45.00       100.00         400.00     180.00
House rent       1       120.00       300.00         300.00     120.00
Total                                               1052.75     457.25

The cost of living index number for 1983 with base year 1971 = 100 is

$$\frac{\sum p_1 q_0}{\sum p_0 q_0} \times 100 = \frac{1052.75}{457.25} \times 100 = 230.23$$

By the family budget method,

$$\text{C.L.I.} = \frac{\sum W P}{\sum W} = \frac{105274.55}{457.25} = 230.23$$
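The agreement of the two methods on the above data can be checked with a short Python sketch. It is offered only as an illustration: the function names and the tuple layout are assumptions, and the family budget weights are taken to be the base year expenditures $p_0 q_0$ as in the formula above.

```python
# (item, base-year quantity q0, base price p0 in 1971, current price p1 in 1983)
budget = [
    ("Rice", 20, 2.10, 4.50),
    ("Dal", 4, 3.50, 8.00),
    ("Sugar", 5, 1.75, 3.85),
    ("G.N. Oil", 3, 8.50, 17.50),
    ("Kerosene", 10, 1.10, 2.30),
    ("Firewood", 1, 31.00, 80.00),
    ("Dhoti", 1, 25.00, 56.00),
    ("Saree", 4, 45.00, 100.00),
    ("House rent", 1, 120.00, 300.00),
]


def aggregate_expenditure_index(rows):
    """Current cost of the base-year basket as a percentage of its base cost."""
    current = sum(q0 * p1 for _, q0, _, p1 in rows)
    base = sum(q0 * p0 for _, q0, p0, _ in rows)
    return current / base * 100


def family_budget_index(rows):
    """Average of price relatives weighted by base-year expenditure."""
    weights = [q0 * p0 for _, q0, p0, _ in rows]          # W = p0 * q0
    relatives = [p1 / p0 * 100 for _, _, p0, p1 in rows]  # P = p1 / p0 * 100
    return sum(w * p for w, p in zip(weights, relatives)) / sum(weights)


print(round(aggregate_expenditure_index(budget), 2))
print(round(family_budget_index(budget), 2))
```

Both functions give the same figure, about 230.2, because weighting each price relative by $p_0 q_0$ reduces algebraically to $\sum p_1 q_0 / \sum p_0 q_0$. The small difference in the hand computation (105274.55 rather than 105275) arises only from rounding the individual price relatives.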


2. Calculate the cost of living index for the following data:

Items:                Food   Clothing   Rent   Fuel   Miscellaneous
Percentage expense:    42       25       14      6         13
Index:                125       95      100     80         90

(II PUC, April 1984)

Let the total expenditure be 100. The index numbers 125, 95 etc. have, as can be seen, been obtained with reference to some base period for which the index was taken as 100.

The cost of living index is

$$\frac{42 \times 125 + 25 \times 95 + 14 \times 100 + 6 \times 80 + 13 \times 90}{100} = \frac{10675}{100} = 106.75$$

6.9 COST OF LIVING NUMBERS — USEFULNESS AND 
LIMITATIONS THEREOF 

1. Since the cost of living index number reflects the changes in prices, it may be used by policymakers as an indicator for price policy, and it also helps in taking corrective measures in economic activities like tax, relief and so on.

2. Dearness allowances are increased on the basis of the cost of living index number. As an example, the wholesale price index for October 6, 1984 (base 1970-71 = 100) being 340.9 also illustrates the utility of an index in measuring the prevailing cost.

3. The cost of living index number can also be used to evaluate the purchasing power of money: when prices increase faster than wages, the real value of the wage earner's income is said to have fallen. Alternatively, if prices fall faster than wages, the real income has risen. The real income or real earnings is obtained as the ratio of the money earnings to the cost of living index number. This process is known as deflation.

Usually the cost of living is considered as the standard of living, which is neither easy to define nor possible to measure. Though the cost of living index indicates the price situation over a period of time, it is difficult to compare the same for two groups in the same region, since the wants, tastes, habits and class vary between such groups.

The consumption pattern may change, and no allowance for this is provided in the cost of living index computation, since base year consumption and prices are used for later years also.

6.10 EXERCISES 


1. Explain 'Index numbers' and indicate their uses and limitations.

2. Distinguish between price, quantity and value index numbers.

3. If a price index for the current year is 125, what is your conclusion? (II PUC, Apr. 1984)

4. Write down any two limitations of index numbers. (II PUC, Apr. 1984)

5. Define an index number. State the various problems involved in the construction of index numbers. (II PUC, April 1978)

6. Discuss the problem of the construction of an index number of wholesale prices with special reference to selection of base period, selection of commodities, selection of the type of average and weighting.

7. Do you think that a chain base index is superior to a fixed base index in computing an index of prices?

8. Why is it necessary to shift the base period of an index number from time to time?

9. Describe Laspeyres' and Paasche's methods of weighting index relatives. Which one would you prefer and why?

10. It is sometimes stated that Laspeyres' index tends to overestimate price changes whereas Paasche's index tends to underestimate price changes. Do you agree with this or not? Substantiate.

11. Define (a) Laspeyres' (b) Marshall-Edgeworth and (c) Fisher's ideal index numbers.

12. Construct the price index number for 1984 and 1983 from the following data using 1982 as the base year:

                           Price per kg
Commodity              1982     1983     1984
1. Rice                4.00     3.95     4.20
2. Sugar               4.55     4.60     4.85
3. Ragi                1.40     1.50     1.60
4. Wheat (Bansi)       2.20     2.35     2.45
5. Coconut oil        22.00    26.00    31.00

13. In 1984, the average price of a commodity was 15% more than in 1983, 18% less than in 1982 and 40% more than in 1985. Reduce the data to price relatives using (a) 1983 and (b) 1984 as base.

14. The link relatives for prices in 1977-1982 were 130, 121, 138, 141, 171, 186. Find the price relative for 1979 with 1976 as base.
15. Compute Laspeyres' and Paasche's index numbers for 1980 with 1979 as base.

                 Quantity (kg)            Price (Rs)
Commodity    1978   1979   1980      1978   1979   1980
A             14     26     21        52    145    103
B             27     15      4        65     48     74
C              2      7      3        18     85    102
D             11      3      5        26     34     45

16. Calculate the price index number of 1984 by (a) Laspeyres' (b) Marshall-Edgeworth and (c) Fisher's methods from the following data:

Commodity            A       B       C       D
1984   Price         8      4.50    6.00   11.00
       Quantity      2      4       1.00    3.00
1974   Price        19     26.0    18      29
       Quantity      3      4       5       2

17. Explain the various mathematical tests for an ideal index number.

18. Describe time reversal and factor reversal tests w.r.t. Fisher's index number.

19. Construct Fisher's ideal index number for the following data and show that it satisfies the time reversal as well as factor reversal tests.

Commodity    Base year   Base year   Current year   Current year
             price       quantity    price          quantity
A1             3           35          10             40
A2             9           41          11             30
A3             7           26           6             21
A4             6           32          12             25
A5            11           42          14             16

20. Calculate Laspeyres', Paasche's and Fisher's ideal indices for the following data. Examine which of the indices satisfy time reversal and factor reversal tests.

                    Base year (1971)        Current year (1985)
Commodity          Quantity    Price        Quantity    Price
po                 100 kg     435.00        1 kg          2.05
Wheat              5 kg        13.00        100 kg      121.00
Sugar              10 kg       48.20        100 kg      230.00
Coconut            1000       3500          1             1.20
Groundnut oil      10 kg      152.88        15 kg        48.25

164 


23: 


24. 


25: 
26. 
27. 
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. Explain the difference between fixed base and chain base methods 


of constructing index numbers. 


. From the following data construct fixed base and chain base 


index numbers. 


Year Food production 
(million tons) 


1974 114.1 
1975 117.2 
1976 119.6 
1977 123.4 
1978 127.6 
1979 130.1 
1980 129.6 
1981 133.3 
1982 128.4 
1983 151.1 
1984 153.5 


From the chain base index numbers given below, 


prepare fixed 
base index numbers 
Year : 1975 1976 1977 1978 1979 1980 1981 
Index: 100 140 130 160 180 200 150 


Prove that the Fisher's ideal index satisfies the time reversal 
test. 


Explain circular test with an example. 
Explain how the cost of living index numbers are calculated. 


What are the merits and limitations of a cost of living index 
number? 


Explain the method of constructing cost of living index num- 
ber for the working class in Bangalore. 


The following table gives the prices and quantities of several 
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items consumed by a typical family. Calculate the cost of living 
index for 1984 taking 1971 as base year. 


1971 1984 
eee СЕ MIR Ale ЖЕРЕ 
Items Price Quantity Price Quantity 
Food articles 1,90 45 3.85 38 
Clothing 4.10 7 42.00 9 
Fuel and light 8.25 14 56.25 10 
House rent 140.00 1 650.00 1 
Education 15.10 3 78.00 3 
Recreation 2.25 4 62.00 5 
Miscellaneous 16.00 11 114.00 8 


g index for 1985 on the basis of 1962 


30. Calculate the cost of livin 
ate expenditure method 


from the following data usingthe aggreg 
as well as family budget method. 


De Die RAIL LI Am 
Price (Rs.) 


Article Quantity 
1962 1985 


consumed in 1962 


EUER euo T 


Wheat 8 kg 1.10 2.60 
Rice 30 kg 1.25 4.35 
Pulses 4kg 0.90 7.50 
Milk 30 litres 0.45 3.10 
= re n AME 
Salt 2 kg. . ! 

Oil 5 kg. 2.50 15.15 
Clothing 10 metres 1.20 17.20 
Firewood 40 kgs 0.15 0.60 
Kerosene ] tin 5.00 64.00 
House rent house 100.00 500.00 


о ии 
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. Explain the difference between fixed base and chain base methods 


of constructing index numbers. 


. From the following data construct fixed base and chain base 


index numbers. 
Year Food production 
(million tons) 
1974 114.1 
1975 117.2 
1976 119.6 
1977 123.4 
1978 127.6 
1979 130.1 
1980 129.6 
1981 133.3 
1982 128.4 
1983 151.1 
1984 : 153.5 


From the chain base index numbers given below, 


prepare fixed 
base index numbers 


Year : 1975 1976 1977 1978 1979 1980 1981 
Index: 100 140 130 160 180 200 150 


Prove that the Fisher's ideal index satisfies the time reversal 
test. 


Explain circular test with an example. 


Explain how the cost of living index numbers are calculated. 


What are the merits and limitations of a cost of living index 
number? 


Explain the method of constructing cost of living index num- 
ber for the working class in Bangalore. 


The following table gives the prices and quantities of several
items consumed by a typical family. Calculate the cost of living
index for 1984, taking 1971 as base year.

                         1971                 1984
Items               Price   Quantity     Price   Quantity
Food articles        1.90      45         3.85      38
Clothing             4.10       7        42.00       9
Fuel and light       8.25      14        56.25      10
House rent         140.00       1       650.00       1
Education           15.10       3        78.00       3
Recreation           2.25       4        62.00       5
Miscellaneous       16.00      11       114.00       8


30. Calculate the cost of living index for 1985 on the basis of 1962
from the following data using the aggregate expenditure method
as well as the family budget method.

Article        Quantity               Price (Rs.)
               consumed in 1962     1962      1985
Wheat          8 kg                 1.10      2.60
Rice           30 kg                1.25      4.35
Pulses         4 kg                 0.90      7.50
Milk           30 litres            0.45      3.10
Sugar          5 kg                 1.40      4.00
Salt           2 kg                 0.12      0.50
Oil            5 kg                 2.50     15.15
Clothing       10 metres            1.20     17.20
Firewood       40 kg                0.15      0.60
Kerosene       1 tin                5.00     64.00
House rent     house              100.00    500.00

31. Compute the consumer price index from the following data:

                    Base year              Price
Items               quantity       Base year   Current year
Food                   18             2.05         3.85
Clothing                3             1.20        16.00
Fuel and light         11             0.80         1.50
Rent                    1           150.00       600.00
Miscellaneous           6             5.00        15.00


32. The average monthly salary of lecturers of a college during
1950, 1960, 1970, 1980 and 1985 were Rs. 75, Rs. 230, 420, 1300
and 1800 whereas the consumer prices for these years were 100,
260, 371, 570 and 582 respectively. Find the real values of the
salaries.


7. Time Series


7.1 INTRODUCTION
A set of observations taken at specified times, usually at equal
intervals, is known as a time series. It is a study of two variables,
one of which is time. The daily closing price of a commodity, the
weekly temperature of a place, monthly sales in a super bazaar, annual
food production and decennial (census) population are some of the
examples of time series. It is evident from the nomenclature that
a series of values of a variable arranged week by week, month by
month, or year by year is known as a time series. It is also defined
by the values of the variate y (food production, sales, price etc.)
at times t, so that y is a function of t [i.e. y = f(t)].


7.2 USE OF TIME SERIES
The analysis of the movements or variations of a characteristic
reveals a pattern that throws some light on future movements, and
hence time series analysis is very useful for forecasting business
conditions. The future needs of a city or town, such as the laying of
water and sewerage pipes and the establishment of shops, banks and
other such necessities, depend on knowledge of the growth rate of the
population. Some components of a time series, like seasonal variation
and cyclical variation, indicate to businessmen whether to increase or
decrease supply on the basis of the seasonal fluctuations. They also
prompt businessmen to review their stock and to improve their
establishments by repair or renovation, so as to be well prepared for
the peak period in which business activity is likely to be higher.
The movement of people, the conduct of elections and other
administrative priorities can likewise be planned from the analysis of
time series data, for example by running special trains or advancing
elections. The time series is thus an indicator of the changing
pattern both over a long period and over short periods.


7.3 COMPONENTS OF TIME SERIES
In a time series the variable does not fluctuate uniformly with time.
Experience shows, for instance, that prices have changed from the past
to the present. It can be observed that though time is a factor in the
change of price, it is not the only contributing factor. Other factors
like supply and demand, monsoon fluctuations and the like also cause
the fluctuation in price. Hence the total variation is sliced into
components to show whether the pattern is cyclical, seasonal, irregular
or otherwise. Such slicing and analysis helps to study the business
conditions for a future date (forecasting).

The following are the four main features or components of a time
series:

1. Long term movement or secular trend: This represents the rise
or fall over a long period of time, the graphical presentation
indicating the general direction.

2. Cyclical variation or cyclical movements: The cyclical variation
refers to the long term oscillations. Movements are considered to
be cyclical if the time interval is more than a year, since the
oscillations in business depend on the cumulative effect of various
economic activities. Usually the cyclical movements refer to the
intervals of business prosperity, depression and the like.

3. Seasonal movements or seasonal variation: This is the swing in
the value of variables like production, sales and price due to changes
of seasons in a year. Evidently the values in such a time series are
recorded hourly, daily, weekly, monthly, quarterly or half yearly,
which are shorter intervals than those used for the other components.

4. Irregular or random movements: The movements of a time series
due to chance causes like wars, earthquakes, elections, strikes etc.
are known as irregular movements. All variations not due to
trend, cyclical or seasonal causes can be considered as irregular,
erratic or random movements.


7.4 ANALYSIS OF TIME SERIES
The analysis of time series refers to the description of the component
movements, the isolation of these components and their interpretation.
The time series variable y is assumed to be the product of the trend,
cyclical, seasonal and irregular movements.

Thus y = TCSI

Thus analysis of time series is nothing but decomposition of the
time series into the component movements, that is, the study of the
factors T, C, S and I. Sometimes, instead of taking the product, the
four components are taken as a sum, i.e. y = (Trend + Cyclical +
Seasonal + Irregular) movements.

Estimation of secular trend: The trend or secular trend is nothing
but the pattern or the tendency of numerical data to increase or
decrease over a long period of time, and the estimation of the trend
can be obtained by the following ways:

(a) Free hand method

(b) The method of moving averages

(c) Method of least squares

(d) Method of semi-averages

The free hand method is nothing but the fitting of a trend line or
curve so as to pass amongst the plotted points, and simply by looking
at the graph an estimate of T is obtained. This method is unsuitable
since it depends too much on individual judgement.

The moving average is nothing but the arithmetic mean calculated
successively over groups of observations of a given order, say 3 or 5
or 6 or 12 and the like. For a set of observations $x_1, x_2, x_3, x_4,
x_5, x_6$ and $x_7$ the moving average of order 3 is given by the
series of arithmetic means

$$\frac{x_1 + x_2 + x_3}{3},\quad \frac{x_2 + x_3 + x_4}{3},\quad \frac{x_3 + x_4 + x_5}{3},\quad \frac{x_4 + x_5 + x_6}{3} \quad\text{and}\quad \frac{x_5 + x_6 + x_7}{3}.$$

The numerators here are called moving totals of order 3. Usually the
moving average is written at its appropriate position in relation to
the original data.

Illustration: Calculate the 4-yearly moving average for the following
data. (PUC-Oct. 1978)

Year    Production of tea    4 year          4 year
        (mill. lbs.)         moving total    moving average
1961         464
1962         515
1963         518               1964            491.00
1964         467               2002            500.50
1965         502               2027            506.75
1966         540               2066            516.50
1967         557               2170            542.50
1968         571               2254            563.50
1969         586               2326            581.50
1970         612

The first moving total is the sum of the first four values i.e. 464 + 515
+ 518 + 467 = 1964, the 2nd moving total is the sum of the 2nd to
the fifth i.e. 515 + 518 + 467 + 502 = 2002 and finally 557 + 571
+ 586 + 612 = 2326 is the moving total for the last four values.
The 4 yearly moving averages for these are

1964/4 = 491,   2002/4 = 500.5,   ...,   2326/4 = 581.5

and the others can be obtained in the same manner.
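The moving totals and moving averages of the tea production illustration can be checked with a short program. The following is only a minimal sketch in Python; the function names `moving_totals` and `moving_averages` are chosen here for illustration and are not part of the text.

```python
def moving_totals(values, order):
    """Return the successive moving totals of the given order."""
    return [sum(values[i:i + order]) for i in range(len(values) - order + 1)]


def moving_averages(values, order):
    """Return the simple moving averages: each moving total divided by the order."""
    return [total / order for total in moving_totals(values, order)]


# Production of tea (million lbs.) for 1961 to 1970, as in the illustration.
tea = [464, 515, 518, 467, 502, 540, 557, 571, 586, 612]

totals = moving_totals(tea, 4)
averages = moving_averages(tea, 4)

for total, average in zip(totals, averages):
    print(total, round(average, 2))
```

For 10 observations and order 4 the program gives 10 - 4 + 1 = 7 moving averages, so values are lost at the ends of the series.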

The moving average, as can be seen from the above illustration,
tends to reduce the variation present in a set of observations.
Hence, by using moving averages of appropriate order, the cyclical,
seasonal and irregular movements are eliminated and the trend
movement can be obtained. This method has some disadvantages. Values
are lost at the ends of the series: for example, 4 yearly moving
averages for 10 years (periods) give only 7 averages. Moving averages
are also affected by extreme values and may then fail to provide a
true picture. Added to these, the movements presented by moving
averages may be cyclical or altogether different from what is
actually present in the data. Moreover, the trend obtained by moving
averages, though smooth enough, cannot be put in the form of a
mathematical equation.

To find the equation of an appropriate trend line or curve the
method of least squares is used, and thereby the trend values T can
be computed either by interpolation or extrapolation.


7.5 METHOD OF LEAST SQUARES
This is nothing but the principle of best fit, under which the sum of
the squares of the differences between the observed values and the
corresponding calculated values is as small as possible, or least.

Thus $\sum d_i^2$ should be minimum, where $d_i$ is the difference
between the observed and calculated values. A curve possessing this
quality is said to fit the data in the least square sense and a line
possessing this property is called the least square line.

In the straight line case, the least square line approximating the set
of points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ has the equation
$y_i = a + bx_i$, where a and b are constants. The least square method
implies that

$$\sum (y_i - a - bx_i)^2 = \text{minimum}.$$

Let

$$S = \sum (y_i - a - bx_i)^2$$

and for this to be minimum

$$\frac{\partial S}{\partial a} = 0 \quad\text{and}\quad \frac{\partial S}{\partial b} = 0.$$

$$\frac{\partial S}{\partial a} = 2\sum (y_i - a - bx_i)(-1) = 0, \quad\text{i.e.}\quad \sum y_i = na + b\sum x_i \qquad (1)$$

$$\frac{\partial S}{\partial b} = 2\sum (y_i - a - bx_i)(-x_i) = 0, \quad\text{i.e.}\quad \sum x_i y_i = a\sum x_i + b\sum x_i^2 \qquad (2)$$

Equations (1) and (2) are called the normal equations, and by solving
these two simultaneous equations the constants a and b can be
obtained. Multiplying (1) by $\sum x_i$ and (2) by n,

$$\sum x_i \sum y_i = na\sum x_i + b\left(\sum x_i\right)^2 \qquad (3)$$

$$n\sum x_i y_i = na\sum x_i + nb\sum x_i^2 \qquad (4)$$

Subtracting the 4th equation from the third, b can be obtained.

Thus
$$b = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$

Similarly
$$a = \frac{\sum y_i \sum x_i^2 - \sum x_i \sum x_i y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$
Example
Fit a linear trend for the following data:

Year:                  1965  1966  1967  1968  1969
Exports ('000 tons):     12    15    22    27    25

Estimate the export figure for the year 1970. (PUC Oct. 1975)

Let the equation to the linear trend be $y_i = a + bx_i$, where
$x_i$ = year - 1965 (say).

Year     Exports y_i     x_i     x_i^2     x_i y_i
         ('000 tons)
1965         12           0        0          0
1966         15           1        1         15
1967         22           2        4         44
1968         27           3        9         81
1969         25           4       16        100
Total       101          10       30        240

The normal equations are

$$\sum y_i = na + b\sum x_i$$
$$\sum x_i y_i = a\sum x_i + b\sum x_i^2$$

Substituting the values for $\sum y_i$, $\sum x_i$, $\sum x_i y_i$ and $\sum x_i^2$

101 = 5a + 10b     (x 2)
240 = 10a + 30b

202 = 10a + 20b
240 = 10a + 30b

Subtracting, 38 = 10b or b = 3.8
5a = 101 - 10b = 101 - 10 (3.8) = 63
a = 12.6

Hence the trend line is y = 12.6 + 3.8x

For the export figure for 1970, x = 1970 - 1965 = 5

Export for 1970 (y) = 12.6 + 3.8 x 5 = 12.6 + 19.0
= 31.6, i.e. 31,600 tons

The least square parabola: The least square parabola (quadratic)
approximating the points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ is

$$y_i = a + bx_i + cx_i^2$$

Instead of two normal equations obtained in the case of a straight
line, three normal equations are obtained in the case of a parabola
since three unknowns are to be determined.

The constants a, b and c can be obtained by solving the following
three normal equations:

$$\sum y = na + b\sum x + c\sum x^2$$
$$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$$
$$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$$

Illustration: The following data show the birth rate in India per
thousand population:

Year:                   1950  1955  1960  1965  1970  1975  1980
B.R. per 1000 popn.:      44    42    40    39    38    35    33

Find the least square parabola fitting the data.

Year    B.R. (y)    x      xy     x^2 y    x^2    x^3    x^4
1950      44       -3    -132      396      9     -27     81
1955      42       -2     -84      168      4      -8     16
1960      40       -1     -40       40      1      -1      1
1965      39        0       0        0      0       0      0
1970      38        1      38       38      1       1      1
1975      35        2      70      140      4       8     16
1980      33        3      99      297      9      27     81
Total    271        0     -49     1079     28       0    196

where x = (year - 1965)/5.

Substituting the relevant totals in the three normal equations

271 = 7a + b x 0 + 28c
-49 = a x 0 + 28b + c x 0
1079 = 28a + b x 0 + 196c

i.e. 7a + 28c = 271
28a + 196c = 1079
28b = -49

From the first two equations a and c can be obtained

7a + 28c = 271     (x 4)
28a + 196c = 1079

28a + 112c = 1084
28a + 196c = 1079

Subtracting, 84c = -5, so that c = -0.0595.
Hence a = 38.95 and b = -1.75
Thus the required equation is

$$y = 38.95 - 1.75x - 0.0595x^2$$
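The three normal equations of the parabola can also be formed and solved by a short program. The following Python sketch is illustrative only; `solve_linear_system` and `fit_parabola` are names assumed here, and the small Gauss-Jordan routine stands in for whatever elimination method the reader prefers.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination with partial pivoting."""
    size = len(rhs)
    rows = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for r in range(size):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]


def fit_parabola(xs, ys):
    """Return (a, b, c) of the least square parabola y = a + b*x + c*x**2."""
    power_sums = [sum(x ** k for x in xs) for k in range(5)]     # sums of x^0 .. x^4
    moment_sums = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    matrix = [[power_sums[0], power_sums[1], power_sums[2]],
              [power_sums[1], power_sums[2], power_sums[3]],
              [power_sums[2], power_sums[3], power_sums[4]]]
    return tuple(solve_linear_system(matrix, moment_sums))


years = [1950, 1955, 1960, 1965, 1970, 1975, 1980]
birth_rate = [44, 42, 40, 39, 38, 35, 33]

xs = [(year - 1965) // 5 for year in years]   # x = (year - 1965)/5 gives -3 .. 3
a, b, c = fit_parabola(xs, birth_rate)

print(round(a, 2), round(b, 2), round(c, 4))
```

Because the x values are symmetric about zero, the sums of x and x^3 vanish and the equation for b separates from the other two, exactly as in the hand calculation.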


Exponential trends: If the $y_i$ values follow approximately a
geometric progression, whereas the $x_i$ values are in arithmetic
progression, the relationship between the two is provided by an
exponential function. The best fitting curve thus depicts the
exponential trend. The growth of bacteria, compound interest and the
like follow an exponential trend.

Consider the exponential curve of the form $y = ab^x$.

The normal equations can be obtained by first taking logarithms
on both sides of this equation.

Thus log y = log a + x log b
i.e. Y = A + Bx
where Y = log y, A = log a and B = log b.

The normal equations for this can easily be obtained:

$$\sum Y = nA + B\sum x$$
$$\sum xY = A\sum x + B\sum x^2$$

Using these two simultaneous equations, A and B are obtained and
by taking antilogarithms a and b can be found.

Example: The number Y of bacteria per unit volume present in a
culture after X hours is given in the following table. Fit a least
square curve having the form $Y = ab^X$ to the data.

Number of hours (X):             0    1    2    3    4    5    6
Number of bacteria per
unit volume (Y):                32   47   65   92  132  190  275

Y = ab^X
log Y = log a + X log b
i.e. y = A + BX, where y = log Y, A = log a and B = log b.

X        Y      y = log Y     X^2       Xy
0        32      1.5051        0        0
1        47      1.6721        1        1.6721
2        65      1.8129        4        3.6258
3        92      1.9638        9        5.8914
4       132      2.1206       16        8.4824
5       190      2.2788       25       11.3940
6       275      2.4393       36       14.6358
Total    21     13.7926       91       45.7015

(The total 21 is the sum of the X values.)

The normal equations are

13.7926 = 7A + 21B     (x 3)
45.7015 = 21A + 91B

41.3778 = 21A + 63B
45.7015 = 21A + 91B

28B = 4.3237
B = 0.1544
A = 1.5071

a = antilog 1.5071 = 32.14 and b = antilog 0.1544 = 1.427
Hence the required equation is

$$Y = 32.14\,(1.427)^X$$
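The logarithmic transformation used for the exponential trend can be reproduced in a few lines. This Python sketch is an illustration under the same assumptions as the worked example (common logarithms, then a straight line fitted to log Y); `fit_exponential` is a name assumed here.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * b**x by least squares on log10(y) = A + B*x."""
    logs = [math.log10(y) for y in ys]
    n = len(xs)
    sum_x = sum(xs)
    sum_log = sum(logs)
    sum_xx = sum(x * x for x in xs)
    sum_xlog = sum(x * v for x, v in zip(xs, logs))
    slope = (n * sum_xlog - sum_x * sum_log) / (n * sum_xx - sum_x ** 2)   # B
    intercept = (sum_log - slope * sum_x) / n                               # A
    return 10 ** intercept, 10 ** slope       # antilogarithms give a and b


hours = [0, 1, 2, 3, 4, 5, 6]
bacteria = [32, 47, 65, 92, 132, 190, 275]

a, b = fit_exponential(hours, bacteria)
print(round(a, 2), round(b, 3))
```

The program works with full-precision logarithms, so its constants may differ in the last digit from values obtained with four-figure log tables.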


7.6 For measuring the trend, the method of semi-averages is also used.
Here the data are separated into two groups, the averages are
taken for these groups and these averages are represented by points
on the time series graph. By drawing the trend line through these
two average points, the trend values are determined. This method
is applicable only if the trend is linear or at least nearly linear.
The trend values can also be obtained without a graph as detailed
below:

Example: Obtain the trend values to the following data by the
method of semi-averages.

Year :       1975  1976  1977  1978  1979  1980  1981  1982  1983  1984
Production:  47.1  51.6  45.3  41.2  39.6  40.4  39.7  38.6  40.1  35.2

By dividing the data into two equal parts, we have

Total (1975-1979): 224.8     A. Mean = 224.8/5 = 44.96
Total (1980-1984): 194.0     A. Mean = 194.0/5 = 38.80

Note that these two means refer to 1977 and 1982 respectively.
There has been a decrease of production from 44.96 to 38.80 in these
5 years. The decrease per year is (44.96 - 38.80)/5 = 6.16/5 = 1.232.

Year:         1975   1976   1977   1978   1979
Trend value:  47.42  46.19  44.96  43.73  42.50

Year:         1980   1981   1982   1983   1984
Trend value:  41.26  40.03  38.80  37.57  36.34

The trend value for 1978 = 44.96 - 1.232 = 43.73
The trend value for 1979 = 44.96 - 2 (1.232) = 42.50 and so on
The trend value for 1976 = 44.96 + 1.232 = 46.19
The trend value for 1975 = 44.96 + 2 (1.232) = 47.42

7.7 ESTIMATING CYCLICAL MOVEMENTS
OR VARIATIONS

In business, it is common to face a period of prosperity followed
by depression or vice versa; the occurrence of such periods is
nothing but cyclical. The duration of these cycles may vary from one
situation to another, and such cycles are isolated by first
eliminating the trend and seasonal variation. By eliminating these
two components the irregular fluctuations and cyclical variations are
left over. By a relevant moving average the irregular variations are
smoothed out, which leaves just the cyclical variations, and these
may be interpreted by obtaining the cyclical indices.


Estimation of seasonal variations or movements

It has been mentioned earlier that seasonal variation refers to the
oscillation of variables like production, population, prices and so on
due to seasons, and hence it is necessary to estimate how the
time series data change from one month to another or from one quarter
to another during a year. The seasonal variations have to be
determined because production must be planned with a knowledge of the
variation in demand over the various periods of a year.

Fig. 24.

The relative values of a variable for all the months, weeks or hours
are called seasonal indices. These are nothing but the monthly or
weekly averages for a particular month or week expressed as a
percentage of the average monthly or weekly figure of a year. Suppose
the production of mopeds in a factory were 60%, 110%, 90%, 150%...
during January, February, March and April as compared to the average
monthly production; then these figures are nothing but the seasonal
indices. This is also called the average percentage method of
computing the seasonal index, since the values for each month are
expressed as a percentage of the average for a year, which results in
12 values which form the seasonal index.


Illustration: Obtain the seasonal index for the following data using
the average percentage method (monthly accidents in a city).

The first step is to determine the monthly averages for each year
from 1972 up to 1980.

Year    Total    Monthly average
1972     252        21.00
1973     288        24.00
1974     232        19.33
1975     233        19.42
1976     163        13.58
1977     112         9.33
1978     182        15.17
1979     170        14.17
1980     217        18.08

From these monthly averages for each year the monthly indices are
obtained as shown below. Each entry is the figure for the month
divided by the monthly average of that year and multiplied by 100.

Year       Jan.      Feb.      Mar.      Apr.      May       June
1972       71.43     52.38    128.57    195.24     90.48     76.19
1973      112.50     58.33    129.17    108.33    100.00    116.67
1974      160.37    103.47     82.77    191.41    155.20    160.37
1975       82.39     92.69    102.99    144.18    211.12     72.09
1976      184.09     73.64    110.46    191.46    110.46     51.55
1977      139.34     96.46    117.90    160.77      0        42.87
1978      138.43     39.55    138.43    105.47     72.51     13.18
1979      127.03     35.29    119.97    169.37    127.03     84.69
1980      105.09     77.43    160.40     94.03    116.15     88.50
Total    1120.67    629.24   1090.66   1360.26    982.95    706.11
Seasonal
index     124.52     69.92    121.18    151.14    109.22     78.46

Year       July      Aug.      Sept.     Oct.      Nov.      Dec.
1972      133.33    171.43     90.48     95.24     61.90     33.33
1973       79.17    104.17    100.00    129.17     87.50     75.00
1974       82.77     36.21     20.69     82.77     25.87     98.29
1975      102.99     97.84    108.14    118.43     36.05     30.90
1976       81.00     44.18    117.82    132.55     22.09     81.00
1977      150.05     53.59     53.59    235.80      0       150.05
1978      158.21     72.51    125.25    105.47     92.29    138.43
1979      127.03     70.57    119.97    134.09     49.40     35.29
1980      116.15     33.19    132.74    154.87     16.59    105.09
Total    1030.70    683.69    868.68   1188.39    391.69    747.38
Seasonal
index     114.52     75.97     96.52    132.04     43.52     83.04

The seasonal index for each month is the total of the monthly indices
for that month divided by 9, the number of years.
The seasonal variation can also be estimated by another method
known as the link relative method, wherein the value for each month is
expressed as a percentage of the value for the previous month. These
percentages are called link relatives as they are linked from month
to month. If the data are available quarterly, each quarterly figure
is expressed as a percentage of the preceding quarter, and the chain
relative of the 1st quarter is taken as 100. From these link
relatives, the chain relatives are obtained, from which the seasonal
indices are obtained as detailed below for a hypothetical example.

Compute the seasonal indices by the link relatives method.

Year           1971   1972   1973   1974
1st quarter     14     25     19     14
2nd quarter     31     26     24     29
3rd quarter     27     26     15     25
4th quarter     16     28     19     16

              Link relatives                  Mean link   Chain      Adjusted chain               Seasonal
Quarter      1971    1972    1973    1974     relative    relative   relative                     index
1st quarter    -    156.25   67.86   73.68     99.26      100.00     100.00                        81.21
2nd quarter 221.43  104.00  126.32  207.14    164.72      164.72     164.72 - 5.68 = 159.04       129.15
3rd quarter  87.10  100.00   62.50   86.21     83.95      138.28     138.28 - 2(5.68) = 126.92    103.07
4th quarter  59.26  107.69  126.67   64.00     89.41      123.64     123.64 - 3(5.68) = 106.60     86.57

Each chain relative is the mean link relative of the quarter
multiplied by the chain relative of the preceding quarter and divided
by 100.

For the 1st quarter the chain relative = (99.26 x 123.64)/100 = 122.73

Hence the correction factor = (122.73 - 100)/4 = 5.68

Mean adjusted chain relatives = (100 + 159.04 + 126.92 + 106.60)/4
= 123.14

Seasonal index for the 1st quarter = (100/123.14) x 100 = 81.21
Seasonal index for the 2nd quarter = (159.04/123.14) x 100 = 129.15
Seasonal index for the 3rd quarter = (126.92/123.14) x 100 = 103.07
Seasonal index for the 4th quarter = (106.60/123.14) x 100 = 86.57

If the quarterly data are divided by the corresponding seasonal
indices, the result is what is known as deseasonalised figures.
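The steps of the link relative method (link relatives, their means, chain relatives, correction and final indices) can be traced with the following Python sketch. It is an illustration only; `link_relative_indices` is a name assumed here, and the means are taken as arithmetic means of the link relatives of each quarter.

```python
def link_relative_indices(table):
    """Seasonal indices by the link relative method.

    table is a list of rows, one per consecutive year, each row holding the
    values of the periods (here four quarters) in order.
    """
    periods = len(table[0])
    flat = [value for row in table for value in row]

    # Link relative: each value as a percentage of the value just before it.
    links = [[] for _ in range(periods)]
    for i in range(1, len(flat)):
        links[i % periods].append(100 * flat[i] / flat[i - 1])
    mean_links = [sum(group) / len(group) for group in links]

    # Chain relatives, starting from 100 for the first period.
    chain = [100.0]
    for p in range(1, periods):
        chain.append(mean_links[p] * chain[p - 1] / 100)

    # Second chain relative of the first period and the correction per period.
    second_first = mean_links[0] * chain[-1] / 100
    correction = (second_first - 100) / periods

    adjusted = [chain[p] - p * correction for p in range(periods)]
    mean_adjusted = sum(adjusted) / periods
    return [100 * value / mean_adjusted for value in adjusted]


quarterly = [
    [14, 31, 27, 16],   # 1971
    [25, 26, 26, 28],   # 1972
    [19, 24, 15, 19],   # 1973
    [14, 29, 25, 16],   # 1974
]

indices = link_relative_indices(quarterly)
print([round(value, 2) for value in indices])
```

The program carries full precision through every step, so its indices may differ from a hand calculation in the second decimal place; their total is 400 in either case.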


7.8 ESTIMATION OF IRREGULAR VARIATIONS

All the variations that remain after eliminating trend, seasonal and
cyclical variations are considered to be unsystematic or random or
irregular variations. In other words, it is the residual variation after
adjusting for trend, cyclical and seasonal variations. It is usually
observed that these variations tend to be negligible.


7.9 EXAMPLES

1. The following table shows the annual food production in India
in millions of tons for the years 1974-1984. Construct a 4 year
moving average and 5 year moving average.

Year:                1974   1975   1976   1977   1978   1979
Food production:    114.1  117.2  119.6  123.4  127.6  130.1
(millions of tons)

Year:                1980   1981   1982   1983   1984
Food production:    129.6  133.3  128.4  151.1  153.5
(millions of tons)

2. Obtain the trend values for the following data using the
method of semi-averages.

Year:            1901   1911   1921   1931   1941
Population:     238.3  252.0  251.2  278.9  318.5
(in millions)

Year:            1951   1961   1971   1981
Population:     361.0  439.1  547.0  684.4
(in millions)

For the purpose of computing the semi-averages, the data are divided
into two equal parts by deleting 1941, and the means of the two parts
are obtained.

Year     Population      Year     Population
1901       238.3         1951       361.0
1911       252.0         1961       439.1
1921       251.2         1971       547.0
1931       278.9         1981       684.4
Total     1020.4         Total     2031.5
Mean       255.1         Mean       507.875

From 1921 to 1971 (5 ten-year intervals) the increase is
507.875 - 255.1 = 252.775, or an increase of 50.555 in every interval
of 10 years. Hence the trend value for 1931 = 255.1 + 50.555
= 305.655, 356.21 for 1941 and so on.

Year:          1901   1911   1921   1931   1941
Trend value:  154.0  204.5  255.1  305.7  356.2

Year:          1951   1961   1971   1981
Trend value:  406.8  457.3  507.9  558.4

3. For the following data, show that the straight line fitted by the
method of least squares is y + 5.11x = 97.04

Year:             1976  1977  1978  1979  1980  1981  1982
Man hours lost:    101    95    68    71   106    81    50

Let the equation of the straight line be y = a + bx, then the normal
equations are

$$\sum y = na + b\sum x \quad\text{and}\quad \sum xy = a\sum x + b\sum x^2$$

Year     Man hours     x = year - 1976     x^2      xy
         lost (y)
1976       101               0               0        0
1977        95               1               1       95
1978        68               2               4      136
1979        71               3               9      213
1980       106               4              16      424
1981        81               5              25      405
1982        50               6              36      300
Total      572              21              91     1573

The simultaneous equations are:

7a + 21b = 572      (x 3)
21a + 91b = 1573

21a + 63b = 1716
21a + 91b = 1573

Subtracting, 28b = -143.
Hence b = -143/28 = -5.1071 and a = 97.04
Hence y = 97.04 - 5.11x i.e. y + 5.11x = 97.04


4. Fit a least square curve of the form $y = ab^x$ to the following data

x:    1    2     3     4     5
y:   15   70   140   250   380

y = ab^x and by taking logarithms
log y = log a + x log b

i.e. Y = A + Bx, where Y = log y, A = log a and B = log b
and the normal equations are

$$\sum Y = nA + B\sum x$$
$$\sum xY = A\sum x + B\sum x^2$$

x         y      Y = log y       xY        x^2
1         15      1.1761        1.1761       1
2         70      1.8451        3.6902       4
3        140      2.1461        6.4383       9
4        250      2.3979        9.5916      16
5        380      2.5798       12.8990      25
Total     15     10.1450       33.7952      55

(The total 15 is the sum of the x values.)

10.1450 = 5A + 15B
33.7952 = 15A + 55B

Solving these two equations B = 0.3360 and A = 1.021 and hence
b = antilog 0.3360 and a = antilog 1.021

b = 2.167 and a = 10.495
so that $y = (10.495)(2.167)^x$ is the required exponential curve.

5. Fit a parabolic trend for the following data and estimate the
production for 1985.

Year:         1978  1979  1980  1981  1982  1983  1984
Production:    5     6.1  11.4  20.5  31.2  37.6  39.2

Let the parabolic equation be $y = a + bx + cx^2$

Year    Production    x = year - 1981    x^2    x^3    x^4      xy      x^2 y
           (y)
1978        5.0            -3              9    -27     81    -15.0     45.0
1979        6.1            -2              4     -8     16    -12.2     24.4
1980       11.4            -1              1     -1      1    -11.4     11.4
1981       20.5             0              0      0      0      0        0
1982       31.2             1              1      1      1     31.2     31.2
1983       37.6             2              4      8     16     75.2    150.4
1984       39.2             3              9     27     81    117.6    352.8
Total     151.0             0             28      0    196    185.4    615.2

The normal equations are

$$\sum y = na + b\sum x + c\sum x^2$$
$$\sum xy = a\sum x + b\sum x^2 + c\sum x^3$$
$$\sum x^2 y = a\sum x^2 + b\sum x^3 + c\sum x^4$$

By substituting the totals

151.0 = 7a + 28c
185.4 = 28b + 0
615.2 = 28a + 196c

From the 2nd equation b can be obtained, and by solving the 1st and
3rd equations a and c are obtained.

b = 185.4/28 = 6.62

7a + 28c = 151       (x 7)
28a + 196c = 615.2

49a + 196c = 1057
28a + 196c = 615.2

21a = 441.8
or a = 21.04 and c = 0.1329
The equation of the parabola for the given data is
$y = 21.04 + 6.62x + 0.1329x^2$
The estimated production (y) for 1985 is obtained by taking x = 4
(1985 - 1981 = 4)
Production for 1985 (y) = 21.04 + (6.62) 4 + (0.1329) 16
= 49.6464
6. Below are given the figures of production (in thousand
quintals) of a sugar factory.

Year:         1970  1971  1972  1973  1974  1975  1976
Production:     77    88    94    85    91    98    90

Fit a straight line for the given data and tabulate the trend values.
(II PUC-Oct. 1978)

Year (x)    Production (y)    u = x - 1973    u^2      uy
1970             77               -3            9     -231
1971             88               -2            4     -176
1972             94               -1            1      -94
1973             85                0            0        0
1974             91                1            1       91
1975             98                2            4      196
1976             90                3            9      270
Total           623                0           28       56

Let the equation of the straight line be
y = a + bu where u = x - 1973
The normal equations are

$$\sum y = na + b\sum u$$
$$\sum uy = a\sum u + b\sum u^2$$

By substituting for $\sum y$, $\sum u^2$ etc.

623 = 7a + b.0
56 = a.0 + 28b

Hence a = 89 and b = 2 so that the equation of the straight line is
y = 89 + 2u
Substituting u = -3, -2, -1, 0, 1, 2, 3 the trend values for
1970, 1971 etc. can easily be obtained.
Thus for 1970, the trend value y = 89 + 2(-3) = 83

1971     y = 89 + 2(-2) = 85
1972     y = 89 + 2(-1) = 87
1973     y = 89
1974     y = 91
1975     y = 93
1976     y = 95

7. Determine the three yearly moving average for the following
data and plot the given values and the 3 yearly moving average on
a graph.

Year:               1960  1961  1962  1963  1964  1965
Value of imports:    120   132   128   140   142   141
(lakhs of Rs.)

Year:               1966  1967  1968  1969  1970
Value of imports:    150   160   180   178   200
(lakhs of Rs.)
(II PUC-Oct. 1974)

Year     Value of imports     3 year     3 year moving
         (lakhs of Rs.)       total      average
1960          120
1961          132              380         126.67
1962          128              400         133.33
1963          140              410         136.67
1964          142              423         141.00
1965          141              433         144.33
1966          150              451         150.33
1967          160              490         163.33
1968          180              518         172.67
1969          178              558         186.00
1970          200

Fig. 25. Value of imports plotted against year (1960 to 1970),
showing the given values and the 3 yearly moving average.

8. Fit a curve of the form $y = AB^x$ to the following data in
which y represents the number of bacteria per unit volume in a
culture at the end of x hours.

x:     0     1     2     3     4
y:    73    91   112   131   162

y = AB^x
Taking logarithms
log y = log A + x log B

i.e. Y = a + bx
where Y = log y
a = log A
b = log B
so that the normal equations are

$$\sum Y = na + b\sum x$$
$$\sum xY = a\sum x + b\sum x^2$$

x         y      log y = Y      x^2       xY
0         73      1.8633          0       0
1         91      1.9590          1       1.9590
2        112      2.0492          4       4.0984
3        131      2.1173          9       6.3519
4        162      2.2095         16       8.8380
Total     10     10.1983         30      21.2473

(The total 10 is the sum of the x values.)

10.1983 = 5a + 10b      (x 3)
21.2473 = 10a + 30b

30.5949 = 15a + 30b
21.2473 = 10a + 30b

i.e. 5a = 9.3476 or a = 1.8695 and b = 0.0851
A = antilog (1.8695) and B = antilog (0.0851)
Hence A = 74.046 and B = 1.2165

The required curve is $y = 74.046\,(1.2165)^x$


9. The following table shows the birth rate per 1000 population
in India in 5 year intervals. Find the least square parabola fitting
the data.

Year:   1945  1950  1955  1960  1965  1970  1975  1980  1985
B.R.:     44    43    42    40    39    37    36    35    33

Let the equation of the parabola be $y = a + bX + cX^2$,
where X = (x - 1965)/5 and x is the year.

Year (x)   B.R. (y)     X     X^2    X^3    X^4     Xy     X^2 y
1945         44        -4     16     -64    256    -176     704
1950         43        -3      9     -27     81    -129     387
1955         42        -2      4      -8     16     -84     168
1960         40        -1      1      -1      1     -40      40
1965         39         0      0       0      0       0       0
1970         37         1      1       1      1      37      37
1975         36         2      4       8     16      72     144
1980         35         3      9      27     81     105     315
1985         33         4     16      64    256     132     528
Total       349         0     60       0    708     -83    2323

The normal equations are

$$\sum y = na + b\sum X + c\sum X^2; \qquad 349 = 9a + 60c \qquad (1)$$
$$\sum Xy = a\sum X + b\sum X^2 + c\sum X^3; \qquad -83 = 60b + 0 \qquad (2)$$
$$\sum X^2 y = a\sum X^2 + b\sum X^3 + c\sum X^4; \qquad 2323 = 60a + 708c \qquad (3)$$

From (2), b = -1.3833; solving equations (1) and (3)
a = 38.8571 and c = -0.0119
Hence the least square parabola is

$$y = 38.8571 - 1.3833X - 0.0119X^2$$

10. The following table represents the population of a town.
Obtain the trend values by using the relation y = a + bT.

Year (t) :   1921   1931   1941   1951   1961
Popn (y) :     80    100    110    120    140
(in '000)

t          y      T = (t - 1941)/10     T^2      Ty
1921       80           -2               4      -160
1931      100           -1               1      -100
1941      110            0               0         0
1951      120            1               1       120
1961      140            2               4       280
Total     550            0              10       140

y = a + bT
where T = (t - 1941)/10.
The normal equations are

$$\sum y = na + b\sum T; \qquad 550 = 5a + 0$$
$$\sum Ty = a\sum T + b\sum T^2; \qquad 140 = 0 + 10b$$

Hence a = 110 and b = 14.
The required equation is y = 110 + 14T
The trend values are:

for 1921     y = 110 - 28 = 82
    1931     y = 110 - 14 = 96
    1941     y = 110 - 0 = 110
    1951     y = 110 + 14 = 124
    1961     y = 110 + 28 = 138

7.10 EXERCISES


1. 


2. 
3; 


10. 


11. 


13: 


14. 


15. 


16. 


What is a time series? Indicate the importance of time series in 
Commerce and Economics? 

Define a time series. Mention the components of a time series. 
Describe briefly the main components and their nature into 
which the observed value in a time series can be broken up. 


. Define the terms Secular trend, Seasonal and Cyclical variations. 
. What are business cycles? How do they differ from Seasonal 


variations? 


. Describe the various methods of determining the trend in a 


Time Series. 


. State the different procedures of obtaining the trend values of 


a time series and discuss their merits and demerits. 
(II PUC-Oct 1977) 


. Explain the method of moving average in determining trend 


of a time series. Mention its advantages and disadvantages. 
(II PUC-Oct. 1978) 


. Mention the different components of a time series. Give the 


different methods of measuring any one of them. 

(II PUC-March 1974) 
Explain the method of moving average and mention its uses. 
Show how the moving average method of determining the trend 
of a time series is related to the method of fitting curves by the 
principle of least squares. 


. Explain the least squares method of determining trend by 


fitting a second degree curve to the given series. 

(II PUC-April 1978) 
А sequence has (a) 24 (b) 25 and (c) 100 numbers. How many 
numbers will there be in a moving average of order 8? 
Prove pe if every number in а sequence is increased ог dec- 
reased by a constant, the moving average is also i 

, so 

decreased by this constant. МАН 
Discuss the different methods for obtainin 


seasonal variation. 8 measures of 


Calculate 4 yearly moving averages for the following data* 


Year: 1961, 1962, 1963, 1964, 1965, 1966, 1967, 1968, 1969, 1970
Production of tea
(in million lbs.): 464, 515, 518, 467, 502, 540, 557, 571, 586, 612
(II PUC-Oct. 78)

17. Compute the three yearly moving average for the following
series:

Year: 1951, 1952, 1953, 1954, 1955, 1956, 1957, 1958, 1959, 1960
Net sales
(lakhs of Rs): 120, 132, 140, 162, 155, 180, 210, 228, 235, 310
(II PUC-Apr. 78)


18. Describe the procedure of obtaining the trend values by means
of the relation y = ab^x, where a and b are constants.

19. The following data represent the strength of a primary school
from 1961 to 1972. Calculate the trend values for the data by
using the method of moving averages of period 3 years. Also
draw the graph of the trend values.

Year: 1961, 1962, 1963, 1964, 1965, 1966,
Strength: 330, 320, 356, 360, 374, 392,
Year: 1967, 1968, 1969, 1970, 1971, 1972
Strength: 400, 420, 430, 438, 440, 460
(II PUC-Mar. 1977)

20. Obtain the trend values for the following data by the method
of Semi-averages where the average is taken as (a) Median and
(b) Mean

Year: 1969, 1970, 1971, 1972, 1973, 1974,
Consumption: 656, 804, 836, 765, 777, 711,
Year: 1975, 1976, 1977, 1978
Consumption: 755, 747, 696, 677


21. Fit a straight line to the following data: 
x; 6 7 % 9 № їй, 1 аз 
pp rh бо 4, 4, 8, 2 1 
22. Fit a linear trend for the following data:
Year: 1965, 1966, 1967, 1968, 1969
Exports (0000 tons): 12, 15, 22, 27, 25

Estimate the export figure for the year 1970.
(II PUC-Oct. 1975)

23. Fit a linear trend to the following data:

Year: 1964, 1965, 1966, 1967, 1968, 1969, 1970
Sales: 35, 42, 44, 48, 46, 49, 51

Estimate the sales for the year 1972. (II PUC-Oct. 1977)
24. Fit a straight line and parabolic curve to the following data: 


x: 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0 
x DLE "Xs. 4 5199. -8Жс di 


25. Fit a least square parabola, y = a + bx + cx^2, to the following
data:

x: 0, 1, 2, 3, 4, 5, 6
у: 24, 24, 3.2, 5.6, 9:3:* 15:65." 219 


26. Show that the least square parabola for the following data is
y = 41.77 - 1.10x + 0.089x^2:

x: 20, 30, 40, 50, 60, 70
y: 54, 90, 138, 206, 292, 396
27. Fit a least square curve of the form y = ab^x to the data given:
x: 0, 1, 2, 3, 4
2 ELO UM 35. 9, зм 


Estimate the value of y when x = 5.
28. Fit a curve of the type Y = ab^X to the following data:
X: 2, 3, 4, 5, 6
Y: 8.3, 15.4, 33.1, 65.2, 127.4
Estimate Y when X = 4.5 and 3.5

29. Fit a curve, y = ax, to the following data: 
x: 1! 2: 3. 4, 5, 
y: Si, 4.8, 53, 6.2, 6.9, 7.8 
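
The moving-average exercises above (14, 16 and 17) can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names `moving_average` and `centred_moving_average` are illustrative choices. It assumes that a moving average of even period is centred by averaging adjacent pairs of simple averages.

```python
def moving_average(values, k):
    """Simple k-term moving averages: one value per window of k consecutive terms."""
    return [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]


def centred_moving_average(values, k):
    """Centred moving averages for an even period k (assumed convention):
    average each adjacent pair of simple k-term moving averages."""
    ma = moving_average(values, k)
    return [(ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)]


# Exercise 16: production of tea, 1961 to 1970
tea = [464, 515, 518, 467, 502, 540, 557, 571, 586, 612]
tea_trend = centred_moving_average(tea, 4)
print(tea_trend)  # [495.75, 503.625, 511.625, 529.5, 553.0, 572.5]

# Exercise 17: net sales, 1951 to 1960
sales = [120, 132, 140, 162, 155, 180, 210, 228, 235, 310]
sales_trend = [round(v, 2) for v in moving_average(sales, 3)]
print(sales_trend)

# Exercise 14: adding a constant c to every term shifts each moving average by c
c = 10
shifted = moving_average([v + c for v in sales], 3)
unshifted = moving_average(sales, 3)
print(all(abs(s - (u + c)) < 1e-9 for s, u in zip(shifted, unshifted)))
```

The last check is only a numerical illustration of Exercise 14 for one series and one constant; it does not replace the algebraic proof the exercise asks for.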


To 


10. 


. КБ) = 3; PE) = 1; PEE) = 1. 


. © 51 


Answers 


CHAPTER 1: 1.12. PROBABILITY 


(i) S = {H T.) (Н,Н,) (737?) (71H3)) 
(ii) (H 1) (H 2) (H 3) (H 4) (H 5) (H 6)
(T 1) (T 2) (T 3) (T 4) (T 5) (T 6)

(41) (А;6;) (R,G2) (R463) (Куб) (5,6) 


. (a) A∪B = {1, 2, 3, 4, 5, 6, 7, 8, 9}
(b) (A∪B)' = {10, 11, 12}
(c) A'∩B' = {10, 11, 12}
(d) A∪B' = {1, 2, 3, 7, 8, 9, 10, 11, 12}


. (a) 120
(b) 72
(c) 12
(a) 150
(b) 45
(c) 100


‚ ANB) = $ 


. 0.29 


259 


36 
16 



20. (1) 

Gi) 

21. = 


22. 0.2228 
23. 0.5766 


24.0) PABA =} 


G) P(AB/AUB) = 1 


25. (i) АВС 
(ii) ABC 
(й) ABUACUBC 
(iv) (ABC) 
(v) А1В1С1 
26. Probability that 3 or 4 breakages are caused by one girl
= 13/64; by the youngest girl = 13/256


27. 0.01 
28. (C) = p МС) = РС, ПС) = 


4 
D э» Коб) = = 


5 


> 




32. == 


33. 


34. 


35. 


36. 


37. 


38. 


39. 
40. 
41. 


42. 


43. 


44. 


45. 


(а) (0.95) 
(b) П- €95)5] 
0.0498 


ый 
29 


(b) 


0.0001486 


Ee 


(a) 


|" 


(b) 


0.34 


- 
- 
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46. 


47. 


49. 


50. 


6 

Е 
216 

0. 


TE 24 1 
Р(АВС) = 2-3 215 


о 
со 


(а) 75 


(b) 


7 terms; (a^6 + 6a^5 b + 15a^4 b^2 + 20a^3 b^3 + 15a^2 b^4 + 6ab^5 + b^6) 


КЕ 


(i) 0.1536 
(ii) 0.3456 
(1) 0.1731 
5 


5318 


CHAPTER 2: 2.5. RANDOM VARIABLE 


1 
· 2800 
. 0.50p 


. 0.101 
Б 113.33 


. E(X) = 12.47, E(X — Y? = 35.78, E(2X + 5) = 29.94
. E(X) = 2, V(X) = 1 and E(X^2) = 5
. Mean = 2, S.D = 1 


13 


' 38 
. E(Y) = 64; V(Y) = 16.24 


25. 


12. 


13. 


14. 


15. 
16. 


· (а) 
CHAPTER 3. PROBABILITY DISTRIBUTIONS 


p(X > 8) = 0.055 


p(x) = 55! / [x! (55 - x)!] (0.9)^x (0.1)^(55-x)


Mean = 49.5; Variance = 4.95 


= ale 


| 


(b) 


-- 
сом 


ool tn 


(c) 


. (a) 250 


(b) 25 
(c) 500 


0.25 


(а) (im) 


(b) 0.29525 


o (a) 
(a) 0.0256 
(b) 0.0081 
(c) 0.1008 
0.2458 
= 0.55 
Expected Frequency: 5, 26, 48, 39, 12 


17. Expected Frequency: 1, 10, 44, 117, 205, 246, 205, 117, 44,
10, 1


jo 1 2 8 4 5 
19. єг" р H үү + єр + т. ће др ер 


22. e^(-m) = 1/10 = 0.10; m = 2.3026 
23. 0.9817 

25. 0.0243 

26. 0.9802 

27. 022 


28. Odds 30 to 1 
29. (i) 0.00248

(ii) 0.04462

(iii) 0.1607

(iv) 0.6964
30. p(x) = 0.487 (0.72)^x / x!, x = 0, 1, 2, 3, ...
31. Probabilities: 0.3012, 0.3614, 0.2184, 0.0868, 0.0260, 0.0062
32. Probabilities: 0.3679, 0.3679, 0.1840, 0.0613, 0.0153, 0.0031
33. Expected Frequencies: 101.3, 120.25, 71.37, 28.25, 8.38, 2.45
34. 61% 
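
The Poisson answers above can be reproduced with a few lines of Python. This is a hedged sketch, not the book's own working: the function name `poisson_pmf` is illustrative, and the mean m = 1 used for answer 32 is an assumption inferred from the first printed probability (0.3679 is e^(-1) to four places), since the exercise itself is not reproduced here.

```python
import math


def poisson_pmf(x, m):
    """Poisson probability p(x) = e^(-m) m^x / x! for mean m."""
    return math.exp(-m) * m ** x / math.factorial(x)


# Answer 32, assuming mean m = 1 (inferred, see the note above)
probs = [round(poisson_pmf(x, 1.0), 4) for x in range(6)]
print(probs)

# Answer 22: solving e^(-m) = 0.10 for m gives m = ln 10
m = round(math.log(10), 4)
print(m)
```

Under this assumption the third probability comes out as 0.1839, against the printed 0.1840; a difference of one unit in the fourth decimal is what would result from halving the already rounded value 0.3679.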


7.5. NORMAL DISTRIBUTION 


cT (x — 22)? 
0 оо | 5025] 
1 (x—2y 
auae exp 6 ] 


9. (a) 0.3256
(b) 0.6406
(c) 0.55
(d) 0.4404
(e) 0.1915
(f) 0.0918
(g) 0.4582
(h) 0.50

10. (i) 994
(ii) 394


11. 0.0918; Rs. 15.08p 


12. (i) 159
(ii) 41
(iii) 15

13. (i) 723
(ii) 48

14. (i) 0.1587
(ii) 0.8185


15. 0.9104 


16. (i) 6.7%
(ii) 1248
(iii) 750 hours
17. (i) 10.56%
(ii) 95%
18. Expected: 1.1, 4.0, 11.1, 23.9, 39.5, 50.2, 49.0, 36.6, 21.1,
9.4, 3.1, 1.0
19. Mean: 48.04; S.D. = 16.31

Expected frequency: 1, 6, 11, 26, 38, 24, 12, 9, 3
CHAPTER 4: 4.8. LARGE SAMPLE TESTS 


1. α = 0.0114
β = 0.1255

2. Z = 4.39, reject H0, the tyres are not as good as is claimed
3. Z = 3.04, reject H0

4. Z = 1.7637, accept H0

5. Z = 3.121, reject H0

6. Z = 2.73, reject H0

7. Z = 17.42, reject H0

8. Z = 1.08, accept H0

9. Z = 1.01, accept H0

10. Z = 3.1134, reject H0
11. Z = 3.8986, City A is more prone to accidents
12. Z = 3.5317, reject H0
13. Z = 0.1643, accept H0
14. Z = 3.028, reject H0


CHAPTER 5: 5.6. VITAL STATISTICS


12. 406.81 lakhs


13. B.R. = 32.37, D.R. = 9.78, IMR = 87.83, Cause Specific D.R.
= 11.24%, Growth rate = 2.26%


14. CBR = 29.98, CDR = 9.99; Growth Rate = 2%
15. CBR = 58.05, CDR = 23.29


18. S.D.R. of place A = 16.26
S.D.R. of place B = 16.10
19. 28.59


20. 


12. 


16. 
13.50 
CHAPTER 6: 6.10. INDEX NUMBERS 


Price relatives of 1984: 105, 106.59, 114.29, 111.36,140.91 
Price relatives of 1983: 98.75, 101.10, 107.14, 106.82, 118.18 


. 140.85, 100, 115.49, 82.68; 


121.95, 86.59, 100, 71.59 


. 217.07 

. 89.40 and 81.65 

. 30.42, 29.985, 29.982 

. 154.55 

. 24.79, 73.98, 1354.25 

. Fixed base: 100, 102.72, 104.82, 108.15, 111.83, 114.02,
113.58, 116.83, 112.53, 132.43, 134.53
Chain base: 102.72, 102.05, 103.18, 103.40, 101.96, 99.62,
102.85, 96.32, 117.68, 101.59


. 100, 140, 182, 291.2, 524.16, 1048.32, 1886.98 
. 605.78 
. 548.68 


‚ 359.27 
. 75, 88.46, 113.21, 228.07, 309.27 
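
The link between the fixed base and chain base figures in the answers above can be illustrated with a short Python sketch. It is an illustration only: the function name `chain_base` is hypothetical, and it works from the rounded fixed base index numbers printed above rather than from the original price data, which are not reproduced here.

```python
def chain_base(fixed):
    """Chain base index numbers: each period's fixed base index expressed
    as a percentage of the preceding period's fixed base index."""
    return [100 * cur / prev for prev, cur in zip(fixed, fixed[1:])]


# Fixed base index numbers as printed in the answer above
fixed = [100, 102.72, 104.82, 108.15, 111.83, 114.02,
         113.58, 116.83, 112.53, 132.43, 134.53]

chain = chain_base(fixed)
print([round(v, 2) for v in chain])
```

Because the inputs are already rounded, a few of the computed values differ from the printed chain base figures by about 0.01.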


CHAPTER 7: 7.10. TIME SERIES 


13. (a) 20
(b) 21
(c) 96

16. 495.75, 503.625, 511.625, 529.5, 553, 572.5

17. 130.67, 144.67, 152.33, 165.67, 181.67, 206, 224.33, 257.67

19. 335.33, 345.33, 363.33, 375.33, 388.67, 404, 416.67, 429.33,
436, 446

20. (a) 777, 711
(b) 767.6, 717.2

21. y = 102.762 - 0.9048X

22. Y = 20.2 + 3.8X and 31.6

23. y = 45 + 2.286X and 56.43

24. Y = -0.1536 + 1.02143X and
Y = 0.558 + 0.345X + 0.13552
25. y = 2.51 - 1.12x + 0.733x^2

27. y = 65.3 (1.425)^x; 186.25

28. y = 2.04 (1.995)^x; 45.643 and 22.88
у = 0.7445 (ху- 85; 1 
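
The linear trend answers for Exercises 22 and 23 can be verified by least squares. The sketch below is a minimal Python illustration, not the book's working; the function name `linear_trend` is hypothetical. It assumes the time origin is taken at the middle of the period, so that the deviations X sum to zero and the normal equations reduce to a = mean of Y and b = sum(XY) / sum(X^2).

```python
def linear_trend(years, values):
    """Fit Y = a + bX by least squares, with X measured from the mid-point
    of the period (so that sum(X) = 0). Returns (a, b, origin)."""
    n = len(years)
    origin = sum(years) / n
    x = [t - origin for t in years]
    a = sum(values) / n  # intercept equals the mean because sum(x) = 0
    b = sum(xi * yi for xi, yi in zip(x, values)) / sum(xi * xi for xi in x)
    return a, b, origin


# Exercise 22: exports, 1965 to 1969
a22, b22, o22 = linear_trend([1965, 1966, 1967, 1968, 1969], [12, 15, 22, 27, 25])
est_1970 = a22 + b22 * (1970 - o22)
print(round(a22, 2), round(b22, 2), round(est_1970, 2))

# Exercise 23: sales, 1964 to 1970
a23, b23, o23 = linear_trend(list(range(1964, 1971)), [35, 42, 44, 48, 46, 49, 51])
est_1972 = a23 + b23 * (1972 - o23)
print(round(a23, 2), round(b23, 3), round(est_1972, 2))
```

With an even number of years the mid-point falls between two years; the same function still applies, but X then takes half-integer values.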


1.5 .0668 .0655 .0643 .0630 .0618 .0606 .0594 .0582 .0571 .0559
1.6 .0548 .0537 .0526 .0516 .0505 .0495 .0485 .0475 .0465 .0455
1.7 .0446 .0436 .0427 .0418 .0409 .0401 .0392 .0384 .0375 .0367
1.8 .0359 .0351 .0344 .0336 .0329 .0322 .0314 .0307 .0301 .0294
1.9 .0287 .0281 .0274 .0268 .0262 .0256 .0250 .0244 .0239 .0233
2.0 .02275 .02222 .02169 .02118 .02068 .02018 .01970 .01923 .01876 .01831
2.1 .01786 .01743 .01700 .01659 .01618 .01578 .01539 .01500 .01463 .01426
2.2 .01390 .01355 .01321 .01287 .01255 .01222 .01191 .01160 .01130 .01101
2.3 .01072 .01044 .01017 .00990 .00964 .00939 .00914 .00889 .00866 .00842
2.4 .00820 .00798 .00776 .00755 .00734 .00714 .00695 .00676 .00657 .00639
2.5 .00621 .00604 .00587 .00570 .00554 .00539 .00523 .00508 .00494 .00480
3.9 .00005
TABLE II. AREAS IN TAIL OF THE NORMAL DISTRIBUTION


The function is 1 - Φ(u), where Φ(u) is the cumulative distribution
function of a standardised Normal variable u. Thus 1 - Φ(u)
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